Last active November 24, 2021 11:52
Split a markdown file into separate files on YAML frontmatter
#!/bin/bash
# Use the Bash (Bourne-again shell) interpreter.
# Work from the directory that contains this script. (Required on OSX.)
cd -- "$(dirname "$0")" || exit 1
# Don't echo these commands.
set +v
# Get the filename from the user.
echo "Enter the file name to split: "
read -r filename
echo "Splitting $filename ..."
# Replace the first line of each set of YAML frontmatter
# with a string delimiter we can replace later.
# This is necessary because csplit can only search
# one line at a time.
perl -i -0pe 's/(^---$)(.+?)(^---$)/---thisisnotthreehyphens$2---/gms' "$filename"
# Get the number of splits we're going to do.
frontmatter=$(grep -c '^---thisisnotthreehyphens$' "$filename")
echo "Found ${frontmatter:-0} markdown docs."
# Stop if there is nothing to split; csplit would fail
# on a negative repeat count.
if [ "${frontmatter:-0}" -lt 1 ]; then
  echo 'No YAML frontmatter found. Nothing to split.'
  exit 1
fi
# Set the number of times we'll repeat the csplit.
# This should be one less than the number of frontmatters.
csplitRepeats=$((frontmatter - 1))
# Split the file at the delimiter we just created.
csplit -ks "$filename" '/---thisisnotthreehyphens/' "{$csplitRepeats}"
# In the original file, remove the string we put there as a delimiter.
perl -i -pe 's/---thisisnotthreehyphens/---/' "$filename"
# Remove the first, blank file. (We assume there is
# nothing before our first '---' that we want to keep.)
rm xx00
# Put back the '---' in the files we created, and rename them.
for file in xx*
do
  perl -i -pe 's/---thisisnotthreehyphens/---/' "$file"
  newFilename=${file/xx/}.md
  mv "$file" "$newFilename"
  echo "Created $newFilename"
done
# All done!
echo 'Done!'
README
At EBW, when we're converting book content into markdown, we start by creating one big markdown file that contains the entire manuscript. Then we add YAML frontmatter where each section or chapter starts. And then we split that file into separate files, one per section or chapter.
This script automates the splitting step. It looks for YAML frontmatter and splits the file there. YAML frontmatter is at least two lines of three hyphens (`---`), usually with some metadata about the file between them.
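As a concrete illustration of the frontmatter described above, here is a minimal sketch that writes a small sample manuscript to a temporary folder and counts its frontmatter blocks. The file name `book.md`, the `title` keys and the chapter text are made up for this example; they are not part of the script. It also assumes the body text contains no `---` lines of its own (a markdown horizontal rule would be counted too).

```shell
#!/bin/bash
# Hypothetical sample input: file name and metadata are illustrative only.
sample_dir=$(mktemp -d)
sample="$sample_dir/book.md"

cat > "$sample" <<'EOF'
---
title: Chapter One
---

Text of the first chapter.

---
title: Chapter Two
---

Text of the second chapter.
EOF

# Each frontmatter block has an opening and a closing '---' line,
# so the number of documents is half the number of such lines.
hyphen_lines=$(grep -c '^---$' "$sample")
doc_count=$((hyphen_lines / 2))
echo "Found $doc_count markdown docs in $sample"
```

Each of the two blocks here would become the start of its own file after splitting.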
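To use the script:
Note: this is work in progress. Back up the file you run this on first!
1. Make the script executable with `chmod +x split.sh` (you only need to do this once).
2. Run it with `./split.sh`, then enter the name of the file to split when prompted. The script changes to its own folder first, so the name is resolved relative to where `split.sh` lives.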
If you are on a Mac, changing the script's file extension from `.sh` to `.command` will let you double-click the script to run it. The first time, you'll still have to give it permission to run, as described above.
Troubleshooting: