getting started error #1

Open
darenr opened this issue Jul 6, 2017 · 5 comments

darenr commented Jul 6, 2017

In the Getting Started section of the README it says csvtohtml; I think you meant csvtotable.

On a separate issue, the first CSV I tried it on gave me a character encoding error. The CSV is not public so I can't attach it, but I'm sure you can find some that contain UTF-8 chars.

Traceback (most recent call last):
  File "/usr/local/bin/csvtotable", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/csvtotable/cli.py", line 30, in cli
    delimiter=delimiter, quotechar=quotechar)
  File "/usr/local/lib/python2.7/dist-packages/csvtotable/convert.py", line 48, in convert
    for row in reader:
  File "/usr/local/lib/python2.7/dist-packages/backports/csv.py", line 394, in __next__
    lineobj = next(self.input_iter)
  File "/usr/lib/python2.7/codecs.py", line 314, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf4 in position 818: invalid continuation byte
@vividvilla
Owner

Thanks for reporting the issue. Can you please give more details, such as which Python version and OS you are using? I have tested CSV data with UTF-8 chars in Python 2.7.10, 3.4 and 3.5 and it seems fine.

@darenr
Author

darenr commented Jul 6, 2017

Python 2.7.13 on Ubuntu 16.10. I will try to extract a few lines of the CSV that reproduce the problem. In my experience character encoding issues are 49% of all Python bugs, 49% are off-by-one and the remaining 2% are all the others :)

@vividvilla
Owner

Indeed :) Character encoding issues are a pain in the ass. I have replaced the backports.csv dependency with unicodecsv, which handles UTF-8 encoded CSV data better. Try upgrading the package to v1.1.0 and check whether the issue is fixed.

@darenr
Author

darenr commented Jul 6, 2017

Successfully installed csvtotable-1.1.0

Unfortunately:

drace@drace:~/Desktop$ csvtotable Worker_Salary_0224.csv worker.html
File (worker.html) already exists. Do you want to overwrite? (y/n): 
Traceback (most recent call last):
  File "/usr/local/bin/csvtotable", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/csvtotable/cli.py", line 30, in cli
    delimiter=delimiter, quotechar=quotechar)
  File "/usr/local/lib/python2.7/dist-packages/csvtotable/convert.py", line 54, in convert
    for row in reader:
  File "/usr/local/lib/python2.7/dist-packages/unicodecsv/py2.py", line 128, in next
    for value in row]
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf4 in position 6: invalid continuation byte

The reason might be this:

drace@drace:~/Desktop$ file Worker_Salary_0224.csv 
Worker_Salary_0224.csv: ISO-8859 text, with very long lines, with CRLF line terminators

The file command reports the encoding as ISO-8859, not UTF-8.
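A minimal sketch of why this fails, assuming the file is ISO-8859-1 (file only says "ISO-8859", so the exact variant is a guess) and using a made-up sample row, since the real CSV is not public. In ISO-8859-1 the character U+00F4 is the single byte 0xf4, while in UTF-8 0xf4 can only start a multi-byte sequence, so a UTF-8 decoder rejects it when an ordinary ASCII byte follows.

```python
# Hypothetical sample row; the CSV from the report is not public.
sample = u"name,city\nJer\u00f4me,Paris\n"

# Encode as ISO-8859-1: the character U+00F4 becomes the single byte 0xf4.
raw = sample.encode("iso-8859-1")


def try_utf8(data):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err


# Decoding the ISO-8859-1 bytes as UTF-8 fails on the 0xf4 byte.
result = try_utf8(raw)
print(result)

# Decoding with the encoding the bytes were written in round-trips cleanly.
recovered = raw.decode("iso-8859-1")
```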

I can anonymize the data if that's useful, but you should be able to take a CSV file you have and use iconv to convert its encoding.
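For anyone without iconv handy, the same conversion can be sketched in Python. This assumes the source really is ISO-8859-1; the function name transcode and its parameters are made up for illustration and are not part of csvtotable.

```python
import io


def transcode(src_path, dst_path,
              src_encoding="iso-8859-1", dst_encoding="utf-8"):
    """Copy src_path to dst_path, changing only the character encoding."""
    # newline="" disables newline translation, so CRLF terminators
    # are written back exactly as they were read.
    with io.open(src_path, "r", encoding=src_encoding, newline="") as src, \
            io.open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        for line in src:
            dst.write(line)
```

Something like transcode("Worker_Salary_0224.csv", "Worker_Salary_0224.utf8.csv") would then give a UTF-8 copy to feed to csvtotable (the output filename is only an example). Going the other way, with src_encoding="utf-8" and dst_encoding="iso-8859-1", turns a UTF-8 CSV containing accented characters into a file that should reproduce the error.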

vividvilla added the bug label Jul 13, 2017
@chfw

chfw commented Jul 14, 2017

Try this:

pyexcel transcode --csv-source-encoding iso-8859-1 Worker_Salary_0224.csv Worker_Salary.sortable.html

You will also need:

$ pip install pyexcel pyexcel-cli pyexcel-sortable

pyexcel-sortable wraps csvtotable.
