Java error while reading a PDF with Python using Tabula

Question

I have installed the tabula library to read PDFs into a pandas DataFrame using Python.
But when I run the following code:

import tabula
# Read the tables found on page 1 of the PDF
df = tabula.read_pdf("sample1.pdf", pages='1')

I get the following error message:

SEVERE: Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed
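
The `SEVERE:` prefix is a Java logging level rather than a Python traceback, so it may be worth checking whether `read_pdf` actually raises on the Python side or still returns a result. A minimal sketch, assuming a hypothetical helper named `try_read` (it is not part of tabula-py) and the same `sample1.pdf` file as above:

```python
def try_read(reader, path, **kwargs):
    """Call reader(path, **kwargs).

    Returns (result, None) on success and (None, exc) if the call raised.
    `try_read` is a hypothetical diagnostic helper, not a tabula-py function.
    """
    try:
        return reader(path, **kwargs), None
    except Exception as exc:  # deliberately broad: this is only a diagnostic
        return None, exc


if __name__ == "__main__":
    import tabula  # provided by the tabula-py package

    tables, error = try_read(tabula.read_pdf, "sample1.pdf", pages="1")
    if error is not None:
        print("Python-side failure:", repr(error))
    else:
        print("read_pdf returned:", type(tables))
```

If the second branch runs, the message was only logged by the Java side and the call itself completed.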

Solutions I have tried:

  1. Reinstalled the Java JDK and made sure it is on the PATH (verified with java -version).
  2. Installed the Java Advanced Imaging tools from this link and restarted my system.
  3. Uninstalled and reinstalled tabula with pip install tabula-py.
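
The PATH check in step 1 can also be repeated from inside Python, which confirms that the interpreter running tabula sees the same `java` as the terminal does. A minimal sketch using only the standard library; the function name `java_version_text` is an assumption made for illustration:

```python
import shutil
import subprocess


def java_version_text(executable="java"):
    """Return the `java -version` banner, or None if the executable is not on PATH."""
    java_path = shutil.which(executable)
    if java_path is None:
        return None
    # `java -version` writes its banner to stderr, so capture both streams.
    result = subprocess.run(
        [java_path, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    banner = java_version_text()
    print(banner if banner is not None else "java was not found on PATH")
```

A `None` result here, despite `java -version` working in a terminal, would suggest that the Python process was started with a different environment.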

Please let me know if I overlooked something.

huangapple
  • Posted on July 24, 2020, 20:33:02
  • When reposting, please keep the link to this article: https://java.coder-hub.com/63073663.html