Java error while reading a PDF with Python using Tabula
I have installed the tabula library to read a PDF into a pandas DataFrame using Python.
But when I run this code:
import tabula
df = tabula.read_pdf("sample1.pdf", pages='1')
I get the following message:
SEVERE: Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed
Solutions I have tried:
- Re-installing the Java JDK and ensuring it is added to PATH (verified with java -version)
- Installing the Java Advanced Imaging tools from this link and restarting my system
- Uninstalling and re-installing tabula with
pip install tabula-py
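For reference, the checks in the list above could be scripted roughly as follows. This is only a minimal sketch, not part of tabula-py: the helper name check_environment is made up for illustration, and it assumes Python 3.8+ so that importlib.metadata is available. It reports which java executable is found on PATH, what java -version prints, and whether the tabula-py distribution (or the separate PyPI distribution named tabula, which is also imported as tabula and can shadow it) is installed.

```python
import shutil
import subprocess
from importlib import metadata


def check_environment():
    """Collect basic facts about the Java and tabula-py setup (hypothetical helper)."""
    report = {}

    # Locate the java executable the same way the shell would, via PATH.
    java_path = shutil.which("java")
    report["java_path"] = java_path

    if java_path:
        # `java -version` normally writes its banner to stderr, not stdout.
        result = subprocess.run(
            [java_path, "-version"], capture_output=True, text=True
        )
        report["java_version"] = (result.stderr or result.stdout).strip()
    else:
        report["java_version"] = None

    # `tabula-py` is the distribution installed in the question; `tabula` is a
    # different PyPI distribution that uses the same import name.
    for dist in ("tabula-py", "tabula"):
        try:
            report[dist] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            report[dist] = None

    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

Running the script prints one line per check; a value of None means the executable or distribution was not found.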
Please let me know if I overlooked something.