Function only reading filtered rows. How do I read all rows from a sheet? #246

abhi250372 · 2022-12-15T01:48:56Z

Here is my code -
filepath = os.path.join(FolderPath, FileName)
a = load_workbook(filepath, read_only=True)

def read_excel_file(path: str, sheet_index: int) -> pd.DataFrame:
buffer = StringIO()
Xlsx2csv(path, outputencoding="utf8").convert(buffer,sheetid=sheet_index)
buffer.seek(0)
df = pd.read_csv(buffer)
return df

dfinal=read_excel_file(filepath,a.index(a.get_sheet_by_name('Sheetname'))+1)

abhi250372 · 2022-12-15T01:52:10Z

I am using XLSX2CSV because it is faster in reading excel files as compared to read_excel() from openpyxl. But using XLSX2CSV only reads filtered rows of a sheet whereas read_excel reads all rows regardless if the sheet is filtered or not.

tongngo · 2022-12-21T17:21:23Z

Hello, I have just experimented it also. Is there a way/option that XLSX2CSV would read all rows of a sheet ?

serene-dev · 2022-12-21T18:00:18Z

what is a filtered row? can you share a sample xlsx file?

tongngo · 2022-12-21T18:08:21Z

In the attached file, ColumnA is filtering rows only showing 'A' value in ColumnA (not showing the rows beginning with 'B' value)
The xlsx2csv would only convert the visible rows, not all rows as Excel would do.
Samplefile.xlsx

abhi250372 · 2022-12-21T23:51:46Z

Here is another example. The openpyxl function returns all rows regardless if the sheet is pre filtered or not but the XLSX2CSV only returns the pre filtered rows. Abhishek

…

On Wed, 21 Dec 2022 at 12:08, TongsasTong ***@***.***> wrote: In the attached file, ColumnA is filtering rows only showing 'A' value in ColumnA (not showing the rows beginning with 'B' value) The xlsx2csv would only convert the visible rows, not all rows as Excel would do. Samplefile.xlsx <https://github.com/dilshod/xlsx2csv/files/10280128/Samplefile.xlsx> — Reply to this email directly, view it on GitHub <#246 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/A4YGWD75OHFZGECCBFZEN7TWONBR7ANCNFSM6AAAAAAS7F2HWU> . You are receiving this because you authored the thread.Message ID: ***@***.***>

abhi250372 · 2022-12-22T14:52:13Z

Here is sample file Car.xlsx

abhi250372 · 2022-12-23T23:06:22Z

Looks like dilshod has fixed the issue. I downloaded the xlsx2csvc.py file in the repository and replaced the old file in my local machine with this new one. I added the skip_hidden_rows as False in my function mentioned above and now it works. However, I need to know when this version will be available via pip install command so that all the users can get the updated version instead of manually downloading this file. Can someone please help with that? def read_excel_file(path: str, sheet_index: int) -> pd.DataFrame:
buffer = StringIO()
Xlsx2csv(path, outputencoding="utf-8",skip_hidden_rows = False).convert(buffer,sheetid = sheet_index)
buffer.seek(0)
df = pd.read_csv(buffer)
return df

tongngo · 2022-12-26T16:07:35Z

Looks like dilshod has fixed the issue. I downloaded the xlsx2csvc.py file in the repository and replaced the old file in my local machine with this new one. I added the skip_hidden_rows as False in my function mentioned above and now it works. However, I need to know when this version will be available via pip install command so that all the users can get the updated version instead of manually downloading this file. Can someone please help with that? def read_excel_file(path: str, sheet_index: int) -> pd.DataFrame: buffer = StringIO() Xlsx2csv(path, outputencoding="utf-8",skip_hidden_rows = False).convert(buffer,sheetid = sheet_index) buffer.seek(0) df = pd.read_csv(buffer) return df

Thank you again @dilshod !!

abhi250372 · 2022-12-29T18:43:59Z

@dilshod When will the latest version be released so that I can do pip install?

abhi250372 · 2023-01-07T00:27:44Z

@dilshod is there an update on this?

hendrikschafer · 2023-05-26T13:25:30Z

It is available via pip now, this issue can be closed!

mcrumiller · 2023-06-27T13:43:14Z

I think skip_hidden_rows=False should be the default. Nobody expects filtered rows to be removed when reading an Excel.

tongngo · 2023-06-27T13:48:18Z

I think skip_hidden_rows=False should be the default. Nobody expects filtered rows to be removed when reading an Excel.

Yes, also, this is the default beahvior of excel when saviong to csv file

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Function only reading filtered rows. How do I read all rows from a sheet? #246

Function only reading filtered rows. How do I read all rows from a sheet? #246

abhi250372 commented Dec 15, 2022

abhi250372 commented Dec 15, 2022

tongngo commented Dec 21, 2022

serene-dev commented Dec 21, 2022

tongngo commented Dec 21, 2022

abhi250372 commented Dec 21, 2022 via email

abhi250372 commented Dec 22, 2022

abhi250372 commented Dec 23, 2022

tongngo commented Dec 26, 2022

abhi250372 commented Dec 29, 2022

abhi250372 commented Jan 7, 2023

hendrikschafer commented May 26, 2023

mcrumiller commented Jun 27, 2023 •

edited

Loading

tongngo commented Jun 27, 2023

Function only reading filtered rows. How do I read all rows from a sheet? #246

Function only reading filtered rows. How do I read all rows from a sheet? #246

Comments

abhi250372 commented Dec 15, 2022

abhi250372 commented Dec 15, 2022

tongngo commented Dec 21, 2022

serene-dev commented Dec 21, 2022

tongngo commented Dec 21, 2022

abhi250372 commented Dec 21, 2022 via email

abhi250372 commented Dec 22, 2022

abhi250372 commented Dec 23, 2022

tongngo commented Dec 26, 2022

abhi250372 commented Dec 29, 2022

abhi250372 commented Jan 7, 2023

hendrikschafer commented May 26, 2023

mcrumiller commented Jun 27, 2023 • edited Loading

tongngo commented Jun 27, 2023

mcrumiller commented Jun 27, 2023 •

edited

Loading