Is there any way to use the tika library on PythonAnywhere?
I'm working on a parsing problem and have been using the tika library on my local system to read PDF documents. Now that I'm deploying my parser on the web, I can't get tika to work on the PythonAnywhere server. I have read that PythonAnywhere doesn't support tika, but it installs and imports on the server without any error. The failure only appears when a request reaches the parser, and I've been stuck on it for a couple of days now.
2020-04-08 11:49:26,003: Retrieving http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server/1.24/tika-server-1.24.jar to /tmp/tika-server.jar.
2020-04-08 11:49:26,006: Exception on /parser [POST]
Traceback (most recent call last):
File "/home/mubasharnazar/.virtualenvs/flaskk/lib/python3.7/site-packages/tika/tika.py", line 798, in getRemoteJar
urlretrieve(urlOrPath, destPath)
File "/usr/lib/python3.7/urllib/request.py", line 247, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/usr/lib/python3.7/urllib/request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.7/urllib/request.py", line 531, in open
response = meth(req, response)
File "/usr/lib/python3.7/urllib/request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python3.7/urllib/request.py", line 569, in error
return self._call_chain(*args)
File "/usr/lib/python3.7/urllib/request.py", line 503, in _call_chain
result = func(*args)
File "/usr/lib/python3.7/urllib/request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/mubasharnazar/.virtualenvs/flaskk/lib/python3.7/site-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/home/mubasharnazar/.virtualenvs/flaskk/lib/python3.7/site-packages/flask/app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/mubasharnazar/.virtualenvs/flaskk/lib/python3.7/site-packages/flask/app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/mubasharnazar/.virtualenvs/flaskk/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/home/mubasharnazar/.virtualenvs/flaskk/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/home/mubasharnazar/.virtualenvs/flaskk/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/mubasharnazar/mysite/server.py", line 1546, in hello
response = read_pdf(name.filename)
File "/home/mubasharnazar/mysite/server.py", line 115, in read_pdf
file_data = parser.from_file(file)
File "/home/mubasharnazar/.virtualenvs/flaskk/lib/python3.7/site-packages/tika/parser.py", line 40, in from_file
output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions)
File "/home/mubasharnazar/.virtualenvs/flaskk/lib/python3.7/site-packages/tika/tika.py", line 338, in parse1
rawResponse=rawResponse, requestOptions=requestOptions)
File "/home/mubasharnazar/.virtualenvs/flaskk/lib/python3.7/site-packages/tika/tika.py", line 531, in callServer
serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path)
File "/home/mubasharnazar/.virtualenvs/flaskk/lib/python3.7/site-packages/tika/tika.py", line 592, in checkTikaServer
getRemoteJar(tikaServerJar, jarPath)
File "/home/mubasharnazar/.virtualenvs/flaskk/lib/python3.7/site-packages/tika/tika.py", line 808, in getRemoteJar
urlretrieve(urlOrPath, destPath)
File "/usr/lib/python3.7/urllib/request.py", line 247, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/usr/lib/python3.7/urllib/request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.7/urllib/request.py", line 531, in open
response = meth(req, response)
File "/usr/lib/python3.7/urllib/request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python3.7/urllib/request.py", line 569, in error
return self._call_chain(*args)
File "/usr/lib/python3.7/urllib/request.py", line 503, in _call_chain
result = func(*args)
File "/usr/lib/python3.7/urllib/request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
Any solution would be highly appreciated.
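Both tracebacks end in the same `urlretrieve` call inside `getRemoteJar`, so the failing step can be checked on its own, outside Flask and tika. Below is a minimal sketch using only the standard library. `probe_download` and `JAR_URL` are hypothetical names introduced here for illustration, and the URL is copied from the log line above.

```python
import urllib.error
import urllib.request

# URL copied verbatim from the "Retrieving ..." log line above.
JAR_URL = (
    "http://search.maven.org/remotecontent"
    "?filepath=org/apache/tika/tika-server/1.24/tika-server-1.24.jar"
)


def probe_download(url, timeout=10):
    """Return the HTTP status code the server (or a proxy) answers with.

    Hypothetical helper: it opens the URL without saving the body.
    Non-HTTP failures (DNS errors, refused connections) are not caught
    and surface as urllib.error.URLError.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.getcode()
    except urllib.error.HTTPError as err:
        # HTTPError carries the status code, e.g. 403 for Forbidden.
        return err.code
```

Calling `probe_download(JAR_URL)` from a console on the server should show whether the 403 also occurs outside the web app. If it does, the download itself is being refused, rather than anything in `server.py` or in the tika parsing code.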