@arthurattwell
Last active November 24, 2021 11:52
Split a markdown file into separate files on YAML frontmatter
#!/bin/bash
# The line above tells the system to run this script with Bash.
# Change to the directory this script is saved in. (Required on OSX.)
cd -- "$(dirname "$0")"
# Don't echo these commands.
set +v
# Get the filename from the user.
echo "Enter the file name to split: "
read -r filename
echo "Splitting $filename ..."
# Replace the opening '---' of each YAML frontmatter block
# with a string delimiter we can replace later.
# This is necessary because csplit can only search
# one line at a time.
perl -i -0pe 's/(^---$)(.+?)(^---$)/---thisisnotthreehyphens$2---/gms' "$filename"
# Get the number of splits we're going to do.
frontmatter=$(grep -c "^\-\-\-thisisnotthreehyphens$" "$filename")
echo "Found $frontmatter markdown docs."
# Set the number of times we'll repeat the csplit.
# This should be one less than the number of frontmatters.
csplitRepeats=$((frontmatter - 1))
# Split the file at the delimiter we just created.
csplit -ks "$filename" '/---thisisnotthreehyphens/' "{$csplitRepeats}"
# In the original file, put back the '---'
# that we replaced with a string delimiter.
perl -i -pe 's/---thisisnotthreehyphens/---/' "$filename"
# Remove the first, blank file. (We assume there is
# nothing before our first '---' that we want to keep.)
rm xx00
# Put back the '---' in the files we created, and rename them.
for file in xx*
do
    perl -i -pe 's/---thisisnotthreehyphens/---/' "$file"
    newFilename="${file/xx/}.md"
    mv "$file" "$newFilename"
    echo "Created $newFilename"
done
# All done!
echo 'Done!'
arthurattwell commented Nov 16, 2018

README

At EBW, when we're converting book content into markdown, we start by creating one big markdown file that contains the entire manuscript. Then we add YAML frontmatter where each section or chapter starts. And then we split that file into separate files, one per section or chapter.

This script automates the splitting step. It will look for YAML frontmatter, and split the file there. YAML frontmatter is at least two lines of three hyphens:

---
---

usually with some metadata about the file between them:

---
title: "Chapter One"
---
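
As a rough illustration of what the script looks for, the sketch below builds a tiny sample manuscript and counts its frontmatter blocks. The helper name count_frontmatter_blocks and the sample text are made up for this example and are not part of the script above. It assumes that every line consisting only of three hyphens is a frontmatter fence, so the number of blocks is half the number of such lines.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Counts YAML frontmatter blocks by counting lines that are exactly '---'
# and halving the total, assuming every such line is a frontmatter fence.
count_frontmatter_blocks() {
    local fences
    fences=$(grep -c '^---$' "$1")
    echo $(( fences / 2 ))
}

# Build a small sample manuscript: one block with metadata, one empty block.
sample=$(mktemp)
printf '%s\n' '---' 'title: "Chapter One"' '---' 'First chapter text.' \
              '---' '---' 'Second chapter text.' > "$sample"

count_frontmatter_blocks "$sample"   # prints 2
rm -f "$sample"
```

If the count is lower than the number of chapters you expect, a fence line probably has trailing spaces or other stray characters on it.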

To use the script:

Note: this is a work in progress. Back up the file before you run the script on it!

  1. Save the script as split.sh in the same folder as the big markdown file you want to split.
  2. In a Terminal, in that folder, give permission to run the script by entering: chmod +x split.sh (you only need to do this once).
  3. In the same Terminal, run the script by entering ./split.sh and then type the name of the file to split when prompted.
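
Because the script edits the big file in place, it helps to make the backup a habit. The sketch below is one possible way to do that. The helper backup_file and the file name book.md are assumptions for illustration only, not part of the script.

```shell
#!/bin/bash
# Hypothetical helper: copy a file to "<name>.bak" before splitting it.
# Refuses to overwrite a backup that already exists.
backup_file() {
    local source_file=$1
    local backup="$source_file.bak"
    if [ -e "$backup" ]; then
        echo "Backup already exists: $backup" >&2
        return 1
    fi
    cp -- "$source_file" "$backup" && echo "Backed up to $backup"
}

# Example session (book.md is an assumed file name):
#   backup_file book.md
#   chmod +x split.sh     # only needed once
#   ./split.sh            # then type book.md at the prompt
```

Refusing to overwrite an existing backup is a deliberate choice here: if a first run goes wrong, a second run cannot replace the good copy with the damaged one.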

If you are on a Mac, changing the script's file extension from .sh to .command will let you double-click the script to run it. The first time, you'll still have to give it permission to run, as described above.

Troubleshooting:

  • If the script finds no markdown docs, your file may have Windows line endings, which stop the script from matching the '---' lines. You can change these in a good text editor. In Sublime Text, go to View > Line endings > Unix. Save the file, and try again.
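
If you would rather check and fix line endings from the Terminal, something like the sketch below may work. The helpers has_crlf and to_unix_endings and the file names are hypothetical and not part of the script. Note that tr removes every carriage return in the file, not only those at the ends of lines.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original script.

# Succeeds if the file contains at least one carriage return,
# which suggests Windows (CRLF) line endings.
has_crlf() {
    grep -q "$(printf '\r')" "$1"
}

# Writes a copy of the first file, with all carriage returns removed,
# to the second file. The original file is left untouched.
to_unix_endings() {
    tr -d '\r' < "$1" > "$2"
}

# Example session (book.md and book-unix.md are assumed names):
#   if has_crlf book.md; then to_unix_endings book.md book-unix.md; fi
```

Writing to a new file rather than editing in place keeps the original as a fallback while you confirm that the converted copy splits correctly.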
