Deleting rows only if any cell in the worksheet is blank (using a script)

2

I have a calc (libreoffice) / Excel spreadsheet (it's actually a text file I opened as a spreadsheet for easy viewing) that contains some blank cells (not the entire line):

I would like if some cell on the right is blank, the entire row is deleted. It's a few thousand lines, so it's not feasible to do it manually.

    
asked by anonymous 28.05.2017 / 17:04

2 answers

3

I did a test like this, to simulate your cells in white:

>>> x = [1491828000,1491828180,1491828360,1491828540,1491828720,1491828900,1491829080,1491829260,1491829440,1491829620,1491829800,1491829980,1491830160]
>>> y = ["1", "2", "3", "", "5", "", "", "", "9", "10", "", "12", ""]

And I generated a CSV like this:

with open('teste.csv', 'a+') as f:
    for i in range(0, len(x)):
            if y[i]: f.write(str(x[i]) + "," + str(y[i]) + "\n")

It looks like this:

    
28.05.2017 / 20:07
3

We want to select only rows that have two cells filled in a csv. For this, we can use the following regular expression:

^..*,..*$

I'm using simple regular expressions, so that even processors that do not implement the + metacharacters can understand. Compatible with grep and sed .

Using ..* has the same result as .+ : you must have at least one character, it can not be the empty string. So we ensure that both columns have value.

If you use a column separator other than , (as in other DSV s), only change the separation character. This solution does not guarantee generalized cells, where the separated can be within escaped quotes.

    
28.05.2017 / 21:53