michabbb/split_merged_pdf_prompt.txt

## split_merged_pdf_prompt.txt
You are tasked with analyzing a merged PDF containing multiple documents in German and creating a bash script to split and rename these documents. Follow these instructions carefully:

First, carefully examine the content of the merged PDF.

Now, follow these steps:

1. Analyze the PDF:
   - Identify all distinct documents within the merged PDF.
   - Ignore any blank pages (pages that contain no letters).
   - Determine the page range for each document.
   - Identify the creation or issue date for each document (format: YYYY-MM-DD).

2. Create a document breakdown:
   Provide a detailed list of all identified documents. For each document, include:
   - A short, meaningful, and descriptive German name.
   - The exact page range(s) that belong to that document.
   - The date when the document was created or issued (format: YYYY-MM-DD).

   Present this information in the following format:

   1. [German name]: Pages [X-Y], Date: [YYYY-MM-DD]
   2. [German name]: Pages [X-Y], Date: [YYYY-MM-DD]
   ...

3. Generate appropriate filenames:
   For each document, create a filename following these rules:
   - Use only lowercase letters in the name; digits and hyphens appear only in the date.
   - Replace spaces with underscores (_).
   - Use short, descriptive German names.
   - Include the document's creation date at the end (format: YYYY-MM-DD).
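
The filename rules above can be sketched as a small bash helper. This is only an illustration: the function name `make_filename` is hypothetical, and the transliteration of umlauts and ß is an assumption that the rules above do not spell out.

```shell
#!/bin/bash

# Hypothetical helper: builds a filename from a German name and an ISO date.
make_filename() {
  local name="$1" date="$2"
  # Assumption: umlauts and ß are transliterated to plain ASCII.
  name=$(printf '%s' "$name" | sed -e 's/ä/ae/g; s/ö/oe/g; s/ü/ue/g' \
                                   -e 's/Ä/ae/g; s/Ö/oe/g; s/Ü/ue/g; s/ß/ss/g')
  # Lowercase the name, then replace spaces with underscores.
  name=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]' | tr ' ' '_')
  # Append the date (YYYY-MM-DD) and the .pdf extension.
  printf '%s_%s.pdf\n' "$name" "$date"
}

make_filename "Rechnung Stadtwerke" "2023-04-12"
# prints: rechnung_stadtwerke_2023-04-12.pdf
```

The document name and date in the call are invented for illustration only.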

4. Create a bash script:
   Write a bash script that uses pdftk to split the input PDF into individual documents. The script should:
   - Define the input PDF filename as "merged_documents.pdf".
   - Include variables for each output filename.
   - Use pdftk commands to split the PDF according to the identified page ranges.
   - Be ready to run without modifications.

   Present the script in a markdown code block, using the following format:

   #!/bin/bash

   # Input PDF file
   input_pdf="merged_documents.pdf"

   # Output files with dates
   [filename_variable]="[generated_filename].pdf"
   ...

   # Splitting PDF pages with pdftk
   pdftk "$input_pdf" cat [page_range] output "$[filename_variable]"
   ...

   echo "PDFs erfolgreich aufgeteilt."
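
As an illustration of how the template might look once filled in, here is a minimal sketch. All document names, dates, and page ranges are invented, and wrapping the commands in a function (`split_example`) is a choice made for this example only, not part of the required format. It relies on pdftk's `cat` operation accepting several page ranges in one call, which is how a blank page in the middle of a document can be skipped.

```shell
#!/bin/bash

# Hypothetical example: names, dates, and page ranges are invented.
split_example() {
  # Input PDF file (defaults to the name required by the instructions)
  local input_pdf="${1:-merged_documents.pdf}"

  # Output files with dates
  local rechnung="rechnung_stadtwerke_2023-04-12.pdf"
  local mietvertrag="mietvertrag_wohnung_2022-11-01.pdf"

  # Page 3 is assumed to be blank, so it is left out of the first document.
  pdftk "$input_pdf" cat 1-2 4 output "$rechnung" || return 1
  pdftk "$input_pdf" cat 5-8 output "$mietvertrag" || return 1

  echo "PDFs erfolgreich aufgeteilt."
}
```

Calling `split_example` with no argument would operate on `merged_documents.pdf`. The `|| return 1` guards stop the run at the first failing pdftk call, so the success message is printed only if every split worked.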

5. Provide final verification:
   After completing the analysis and script creation, provide the following information:

   - Total number of pages in the merged PDF: [specify]
   - Total number of pages accounted for: [specify]
   - Number of blank pages identified: [specify]
   - Confirmation that all pages are accounted for: [Yes/No]
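
The page accounting above can also be checked mechanically. The sketch below is one possible approach, not a required part of the output: `unaccounted_pages` is a hypothetical helper that takes the total page count followed by every page or range assigned to a document or marked as blank, and prints the pages that nothing covers. The total itself could be read with `pdftk merged_documents.pdf dump_data`, which reports a `NumberOfPages` line.

```shell
#!/bin/bash

# Hypothetical helper: prints every page from 1..total that is missing
# from the given specs. Each spec is a single page ("3") or a range ("4-7").
unaccounted_pages() {
  local total="$1"; shift
  local -a seen=()
  local spec first last page
  for spec in "$@"; do
    first="${spec%-*}"   # text before the hyphen (or the whole spec)
    last="${spec#*-}"    # text after the hyphen (or the whole spec)
    for ((page = first; page <= last; page++)); do
      seen[page]=1
    done
  done
  # Report each page that no spec covered.
  for ((page = 1; page <= total; page++)); do
    [[ -n "${seen[page]:-}" ]] || echo "$page"
  done
}
```

For example, `unaccounted_pages 8 1-2 3 4 5-8` prints nothing because every page is covered, while `unaccounted_pages 8 1-2 4 5-7` prints `3` and `8`, each on its own line.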

Ensure that you:
- Analyze the PDF thoroughly, identifying all documents and ignoring blank pages.
- Double-check all page ranges for accuracy.
- Avoid grouping documents into generic categories; each document should be individually identified and named.
- Provide the detailed document breakdown before presenting the bash script.

Present your final output in this order:
1. Document breakdown
2. Bash script
3. Final verification