How can I get PDFTextStripper to extract text row by row?
Here is an excerpt of the input (PDF):
Here is my code:
public static String pdfPageToText(
        PDDocument docIn,
        int pageNumber
) {
    String pageText = "";
    try {
        PDFTextStripper stripper = new PDFTextStripper( );
        // Restrict extraction to a single page (page numbers are 1-based).
        stripper.setStartPage( pageNumber );
        stripper.setEndPage( pageNumber );
        pageText = stripper.getText( docIn );
    } catch ( Exception e ) {
        LOGGER.severe( e.getMessage( ) );
    }
    return pageText;
}
The extracted text looks like this:
I would expect it to be more like this:
Please point me in the right direction. Thank you.
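A possible starting point, offered as a hedged note rather than a confirmed fix: by default PDFTextStripper tends to emit text in the order it occurs in the page's content stream, which need not match the visual rows. Calling stripper.setSortByPosition( true ) before getText( docIn ) asks it to order text by position instead, and is usually the first thing to try. The sketch below illustrates the underlying idea without depending on PDFBox. Fragment, groupIntoRows and the tolerance parameter are hypothetical names made up for this illustration, and it assumes y grows downward from the top of the page.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class RowGrouper {

    /** Hypothetical stand-in for a positioned piece of text on a page. */
    public static final class Fragment {
        final double x;
        final double y;
        final String text;

        public Fragment(double x, double y, String text) {
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    /**
     * Groups fragments into visual rows: fragments whose y value lies within
     * `tolerance` of the first fragment of a row belong to that row.
     * Rows are returned top to bottom, each joined left to right.
     */
    public static List<String> groupIntoRows(List<Fragment> fragments, double tolerance) {
        // Sort a copy top to bottom (assumes y grows downward).
        List<Fragment> sorted = new ArrayList<>(fragments);
        sorted.sort(Comparator.comparingDouble((Fragment f) -> f.y));

        // Start a new row whenever y moves further than the tolerance.
        List<List<Fragment>> rows = new ArrayList<>();
        double rowY = 0;
        for (Fragment f : sorted) {
            if (rows.isEmpty() || Math.abs(f.y - rowY) > tolerance) {
                rows.add(new ArrayList<>());
                rowY = f.y;
            }
            rows.get(rows.size() - 1).add(f);
        }

        // Within each row, order left to right and join with single spaces.
        List<String> lines = new ArrayList<>();
        for (List<Fragment> row : rows) {
            row.sort(Comparator.comparingDouble((Fragment f) -> f.x));
            StringBuilder line = new StringBuilder();
            for (Fragment f : row) {
                if (line.length() > 0) {
                    line.append(' ');
                }
                line.append(f.text);
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Fragments arrive column by column, as they might in a content stream.
        List<Fragment> fragments = List.of(
                new Fragment(50, 100, "Name"),
                new Fragment(50, 120, "Apple"),
                new Fragment(200, 100.5, "Qty"),
                new Fragment(200, 120, "3"));
        groupIntoRows(fragments, 2.0).forEach(System.out::println);
    }
}
```

If sorting by position alone does not give the expected rows, a similar grouping could be applied to the text positions PDFBox reports, with the tolerance tuned to the line height of the document.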