In each profile, ensure that the `blacklist` section is placed directly
above the `include disable` section.
See etc/templates/profile.template.
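
A minimal sketch of how the ordering rule above could be checked on a
single profile. This is not a script from the repository: the helper
names are hypothetical, and it assumes that blank lines and plain
comments may sit between the two sections. The `blacklist` pattern
mirrors the `git grep` expression used in this message.

```python
import re

# Same pattern as the `git grep` call in this message; the second
# pattern is an assumption about how the disable includes are written.
BLACKLIST = re.compile(r"^#?blacklist ")
DISABLE = re.compile(r"^#?include disable-")


def classify(line):
    """Label a profile line; return None for lines that are ignored."""
    if BLACKLIST.match(line):
        return "blacklist"
    if DISABLE.match(line):
        return "disable"
    if not line.strip() or line.startswith("#"):
        return None  # blank line or plain comment
    return "other"


def blacklist_right_above_disable(text):
    """True if the blacklist lines form one run directly above the
    first line that follows them, and that line is a disable include."""
    kinds = [k for k in map(classify, text.splitlines()) if k is not None]
    if "blacklist" not in kinds or "disable" not in kinds:
        return True  # nothing to order
    first = kinds.index("blacklist")
    last = len(kinds) - 1 - kinds[::-1].index("blacklist")
    contiguous = all(k == "blacklist" for k in kinds[first:last + 1])
    return contiguous and last + 1 < len(kinds) and kinds[last + 1] == "disable"
```

For example, a profile whose `blacklist` line comes after
`include disable-common.inc` would make the function return `False`.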
Misc: This appears to affect about a third (49 of 158) of the profiles
that contain `blacklist` entries:
$ git grep -El '^#?blacklist ' -- etc/profile* | wc -l
158
$ git diff --name-only f1381b342 | wc -l
49
Loosely related to commit 04efbb276 ("profiles: replace x11 socket
blacklist with disable-X11.inc", 2024-03-22) / PR #6286.
As reported by @kmille[1]:
The current `tesseract` profile breaks `ocrmypdf`:
kmille@linbox:scans ocrmypdf C.pdf del.pdf
Scanning contents ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 1/1 0:00:00
1 Error, could not create hOCR output file: No such file or directory tesseract.py:253
1 Error, could not create TXT output file: No such file or directory tesseract.py:253
OCR ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0/1 -:--:--
An exception occurred while executing the pipeline _common.py:294
Traceback (most recent call last):
File "/usr/lib/python3.12/site-packages/ocrmypdf/_pipelines/_common.py", line 259, in cli_exception_handler
return fn(options, plugin_manager)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
[...]
File "/usr/lib/python3.12/pathlib.py", line 840, in stat
return os.stat(self, follow_symlinks=follow_symlinks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/ocrmypdf.io.0od81kk5/000001_ocr_hocr.hocr'
These are some of the commands that run in the background:
[...]
2024/11/23 22:13:53 PID=403915 UID=0 CMD=/usr/bin/firejail /usr/bin/tesseract --list-langs
2024/11/23 22:13:53 PID=403917 UID=0 CMD=/run/firejail/lib/fcopy /usr/bin/text2image /run/firejail/mnt/bin
2024/11/23 22:13:53 PID=403939 UID=1000 CMD=gs -dQUIET [...] -f /tmp/ocrmypdf.io.0od81kk5/origin.pdf
[...]
2024/11/23 22:14:03 PID=403953 UID=0 CMD=tesseract -l eng /tmp/ocrmypdf.io.0od81kk5/000001_ocr.png [...]
Fixes #6550.
[1] https://github.com/netblue30/firejail/issues/6550#issue-2686607038
Reported-by: @kmille
Suggested-by: @kmille
Tesseract is a CLI program and its output may be parsed by other
programs (such as `ocrmypdf`). Including messages from firejail in the
output may break the parsing, so remove them.
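
To illustrate why stray lines matter, here is a minimal sketch of a
strict line-based parser. Everything in it is made up for illustration
and is not taken from tesseract, firejail or ocrmypdf: the function name
`parse_lang_codes`, the expected line format, the sample output and the
banner text are all assumptions.

```python
import re

# Hypothetical line format: a short lowercase code, optionally
# followed by an underscore suffix.
LANG_CODE = re.compile(r"^[a-z]{3}(_[a-z]+)?$")


def parse_lang_codes(output: str) -> list[str]:
    """Parse one code per line and reject anything unexpected."""
    codes = []
    for line in output.splitlines():
        if not LANG_CODE.match(line):
            raise ValueError(f"unexpected line: {line!r}")
        codes.append(line)
    return codes


# Made-up sample data: the same output without and with an extra line
# printed by a wrapper program.
clean = "eng\ndeu\n"
noisy = "sandbox: starting up\n" + clean
```

With this sketch, `parse_lang_codes(clean)` succeeds, while
`parse_lang_codes(noisy)` raises `ValueError` on the first line.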
Fixes #6171.
Reported-by: @kmille
* Add firecfg support for tesseract
* Add tesseract to 'New profiles' section in README.md
* Create tesseract.profile
* tesseract: fix private-etc
* tesseract: fix XDG black/whitelisting
* tesseract: use 'seccomp socket' instead of 'protocol unix'
As kindly suggested by @rusty-snake.
* tesseract: add 'restrict-namespaces'
As kindly suggested by @rusty-snake.
* tesseract: use full seccomp filtering
The tesseract application works fine without 'protocol' or 'seccomp socket'.