-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pd.read_excel() on the output of infer_schema.xlsx returns AssertionError: #4
Comments
Seems the error is from xlrd .. |
Hi @sugiantolauw, The errors roots from the fact that the infer_schema function generates data schema as an .xlsx file containing lots of unnecessary columns. Manually selecting and deleting all the remaining unused columns solved the issue for me. @SauceCat, |
@sugiantolauw @rohanneps
|
You can also solve this by saving the edited output workbook as a csv and then use pd.read_csv instead. |
@SauceCat Can you please let me know if you have a fix for this? |
you can change your code from test_schema = pd.read_excel('output/data_schema_test_schema.xlsx') to test_schema = pd.read_excel('output/data_schema_test_schema.xlsx', engine='openpyxl') |
Hi There,
This looks like a great package and I was testing your package for my own automation.
So, I have created the schema of my dataframe with the infer_schema() function and it returns the .xlsx file.
When I tried to read the .xlsx schema file using pd.read_excel() function, it returns "AssertionError: "
I am not sure what is happening here.
Here is the complete error:
Here is what I have installed on my python:
airflow==1.8.0
alembic==0.8.10
annoy==1.11.1
asn1crypto==0.24.0
attrs==17.4.0
Automat==0.6.0
awscli==1.14.32
basemap==1.0.7
bcolz==1.1.2
beautifulsoup4==4.6.0
bitarray==0.8.1
bleach==1.5.0
bokeh==0.12.14
boto==2.48.0
boto3==1.5.22
botocore==1.8.36
branca==0.2.0
bs4==0.0.1
bz2file==0.98
cachetools==2.0.1
Cartopy==0.15.1
catboost==0.6
certifi==2018.1.18
cffi==1.11.4
chardet==3.0.4
click==6.7
click-plugins==1.0.3
cliff==2.8.0
cligj==0.4.0
cloudpickle==0.5.2
cmd2==0.8.0
colorama==0.3.7
colorcet==1.0.0
configparser==3.5.0
constantly==15.1.0
croniter==0.3.20
cryptography==2.1.4
cssselect==1.0.3
cycler==0.10.0
cymem==1.31.2
Cython==0.27.3
cytoolz==0.8.2
dask==0.17.0
datashader==0.6.5
datashape==0.5.4
deap==1.2.2
decorator==4.2.1
descartes==1.1.0
dill==0.2.7.1
distributed==1.21.0
Django==1.11.8
docutils==0.14
en-core-web-sm==2.0.0
entrypoints==0.2.3
et-xmlfile==1.0.1
fastai==0.6
feather-format==0.4.0
Fiona==1.7.11.post1
Flask==0.11.1
Flask-Admin==1.4.1
Flask-Cache==0.13.1
Flask-Login==0.2.11
flask-swagger==0.2.13
Flask-WTF==0.12
folium==0.5.0
ftfy==4.4.3
funcsigs==1.0.0
future==0.15.2
gensim==3.2.0
geopandas==0.3.0
geoplot==0.1.2
gitdb2==2.0.3
GitPython==2.1.8
graphviz==0.8.2
gunicorn==19.3.0
h5py==2.7.1
heapdict==1.0.0
html5lib==0.9999999
hyperlink==17.3.1
idna==2.6
ijson==2.3
impyla==0.14.0
incremental==17.5.0
ipykernel==4.8.0
ipython==6.2.1
ipython-genutils==0.2.0
ipywidgets==7.1.1
isoweek==1.3.3
itsdangerous==0.24
JayDeBeApi==1.1.1
jdcal==1.3
jedi==0.11.1
Jinja2==2.8.1
jmespath==0.9.3
JPype1==0.6.2
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==5.2.2
jupyter-console==5.2.0
jupyter-contrib-core==0.3.3
jupyter-contrib-nbextensions==0.4.0
jupyter-core==4.4.0
jupyter-highlight-selected-word==0.1.0
jupyter-latex-envs==1.4.1
jupyter-nbextensions-configurator==0.4.0
kaggle-cli==0.12.13
Keras==2.1.3
ktext==0.31
lightgbm==2.1.0
llvmlite==0.21.0
locket==0.2.0
lockfile==0.12.2
lxml==3.8.0
Mako==1.0.7
Markdown==2.6.9
MarkupSafe==1.0
matplotlib==2.1.2
MechanicalSoup==0.8.0
missingno==0.4.0
mistune==0.8.3
mizani==0.4.4
more-itertools==4.1.0
msgpack==0.5.1
msgpack-numpy==0.4.1
msgpack-python==0.5.1
multipledispatch==0.4.9
multiprocess==0.70.5
munch==2.2.0
murmurhash==0.28.0
nbconvert==5.3.1
nbformat==4.4.0
networkx==2.1
nltk==3.2.5
nose==1.3.7
notebook==5.3.1
numba==0.36.2
numpy==1.14.0
olefile==0.45.1
opencv-python==3.4.0.12
openpyxl==2.5.0
ordereddict==1.1
packaging==16.8
palettable==3.1.0
pandas==0.22.0
pandas-summary==0.0.41
pandocfilters==1.4.2
param==1.5.1
parsel==1.3.1
parso==0.1.1
partd==0.3.8
pathlib==1.0.1
pathos==0.2.1
patsy==0.5.0
pbr==3.1.1
pexpect==4.3.1
pickleshare==0.7.4
Pillow==5.0.0
plac==0.9.6
plotly==2.3.0
plotnine==0.3.0
ply==3.10
pox==0.2.3
ppft==1.6.4.7.1
preshed==1.0.0
prettytable==0.7.2
progressbar2==3.34.3
prompt-toolkit==1.0.15
protobuf==3.4.1
psutil==5.4.3
ptyprocess==0.5.2
pyarrow==0.8.0
pyasn1==0.4.2
pyasn1-modules==0.2.1
pycparser==2.18
PyDispatcher==2.0.5
pydot==1.2.4
pydqc==0.1
pyemd==0.5.1
Pygments==2.2.0
pyodbc==4.0.22
pyOpenSSL==17.5.0
pyparsing==2.2.0
pyperclip==1.6.0
Pyphen==0.9.4
pyproj==1.9.5.1
pyshp==1.2.12
python-daemon==2.1.2
python-dateutil==2.6.1
python-editor==1.0.3
python-Levenshtein==0.12.0
python-nvd3==0.14.2
python-slugify==1.1.4
python-utils==2.2.0
pytz==2017.3
PyYAML==3.12
pyzmq==16.0.3
qtconsole==4.3.1
queuelib==1.4.2
regex==2017.4.5
requests==2.18.4
rsa==3.4.2
s3transfer==0.1.12
scikit-learn==0.19.1
scipy==1.0.0
Scrapy==1.5.0
seaborn==0.8.1
selenium==3.8.1
Send2Trash==1.4.2
service-identity==17.0.0
setproctitle==1.1.10
Shapely==1.6.4.post1
simplegeneric==0.8.1
six==1.11.0
sklearn-pandas==1.6.0
smart-open==1.5.6
smmap2==2.0.3
sortedcontainers==1.5.9
spacy==2.0.6
SQLAlchemy==1.2.2
statsmodels==0.8.0
stevedore==1.28.0
stopit==1.1.1
tabulate==0.7.7
tblib==1.3.2
tensorflow==1.4.1
tensorflow-tensorboard==0.1.5
teradata==15.10.0.21
termcolor==1.1.0
terminado==0.8.1
testpath==0.3.1
textacy==0.5.0
Theano==1.0.1
thinc==6.10.2
thrift==0.9.3
thriftpy==0.3.9
toolz==0.9.0
torchtext==0.2.1
tornado==4.5.3
TPOT==0.9.2
tqdm==4.19.5
traitlets==4.3.2
Twisted==17.9.0
ujson==1.35
Unidecode==1.0.22
update-checker==0.16
urllib3==1.22
w3lib==1.19.0
wcwidth==0.1.7
webencodings==0.5.1
Werkzeug==0.14.1
widgetsnbextension==3.1.0
wrapt==1.10.11
WTForms==2.1
xarray==0.10.0
xgboost==0.7
xlrd==1.1.0
XlsxWriter==1.0.2
zict==0.1.3
zope.deprecation==4.3.0
zope.interface==4.4.3
Let me know your thoughts and if you need more info.
Sugi
The text was updated successfully, but these errors were encountered: