2014-03-11

Parsing fountain files

Fountain is a markup language for screenplays and stageplays, developed by John August, Nima Yousefi, Stu Maschwitz, and others.

How can you get information from a fountain file, such as a character list? One way is to convert it to another format and use a commercial product (such as Final Draft) or import it into Trelby (which is free) and use the character report feature.

This post gives some simple Python functions for doing this directly. All code here is my own copyright and is licensed under the modified BSD license. (The code for the character graph also requires Sage, a free mathematical software system based on Python.)

First, let's take an example of a fountain document, such as John August's Big Fish (2003) script. (The film was directed by Tim Burton, and I highly recommend it if you haven't seen it already.)

The screenplay, in fountain and as a pdf, can be downloaded (free) from fountain.io.


The first function simply returns the list of scenes (optionally, with line numbers).

def scene_list(input_ftn_file, line_numbers=False):
    """
    INPUT:
       input_ftn_file could be "/path/my-script.fountain"
       line_numbers - if True, pair each heading with its 0-based line index

    OUTPUT:
       list of scenes occurring in script

    """
    # The context manager closes the file even if reading fails.
    with open(input_ftn_file) as f:
        lines = f.readlines()
    scene_lst = []
    for j, x in enumerate(lines):
        # A scene heading is taken to be any line starting with INT or EXT.
        if x[:3].upper() in ("INT", "EXT"):
            heading = x.rstrip("\n")
            if line_numbers:
                scene_lst.append([j, heading])
            else:
                scene_lst.append(heading)
    return scene_lst

For example, Big Fish has 194 scenes and we can see a partial list (with the line numbers in the script) as follows:

>>> scene_list(input_ftn_file, line_numbers = True)
[[30, "INT.  WILL'S BEDROOM - NIGHT (1973)"],
 [39, 'EXT.  CAMPFIRE - NIGHT (1977)'],
 [62, 'INT.  BLOOM FRONT HALL - NIGHT (1987)'],
 [snip]
 [4623, 'EXT.  RIVER / UNDERWATER - DAY']]
>>> len(scene_list(input_ftn_file, line_numbers = True))
194
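
The three-letter prefix test above accepts any line that starts with INT or EXT, including one that begins with a word like INTERCUT. A stricter check could be sketched as follows. It assumes the heading prefixes given in the Fountain syntax notes (INT, EXT, EST, INT./EXT, INT/EXT, I/E, each followed by a dot or a space) plus the leading-period "forced" heading. The names SCENE_PREFIX and is_scene_heading are hypothetical, not part of the code in this post, and the sketch ignores the rule that a heading is followed by a blank line.

```python
import re

# Assumed prefix list, taken from the Fountain syntax notes; extend it if your
# script uses other forms (for example "EXT/INT").
SCENE_PREFIX = re.compile(r"^(INT\./EXT|INT/EXT|INT|EXT|EST|I/E)[. ]", re.IGNORECASE)


def is_scene_heading(line):
    """Hypothetical helper: True if line looks like a Fountain scene heading."""
    text = line.strip()
    # A single leading period forces a heading; two or more periods do not.
    if text.startswith(".") and not text.startswith(".."):
        return len(text) > 1
    return SCENE_PREFIX.match(text) is not None


print(is_scene_heading("INT.  WILL'S BEDROOM - NIGHT (1973)\n"))
print(is_scene_heading("INTERCUT HALLWAY / BEDROOM\n"))
```

Swapping this test into scene_list would change the scene count, so treat it as an alternative rather than a drop-in replacement.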

The next function tries to return a complete character list:

def character_list(input_ftn_file):
    """
    INPUT:
       input_ftn_file could be "/home/wdj/my-script.fountain"

    OUTPUT:
       list of characters with speaking parts occurring in script

    """
    with open(input_ftn_file, 'r') as f:
        lines = f.readlines()
    char_list = []
    N = len(lines)
    for j in range(1, N-1):
        x = lines[j]
        # A character cue is an upper-case line preceded by a blank line
        # and followed by a non-blank line (the dialogue).
        if lines[j-1] == "\n" and lines[j+1] != "\n" and x.isupper():
            char = x.rstrip("\n")
            # Drop extensions such as "(V.O.)" or "(CONT'D)".
            if "(" in char and ")" in char:
                char = char[:char.index("(")]
            # Remove non-breaking spaces (UTF-8 bytes C2 A0, as read by
            # Python 2) and any stray halves of that byte pair.
            for junk in ("\xc2\xa0", "\xc2", "\xa0"):
                char = char.replace(junk, "")
            char = char.rstrip(" ")
            # Skip cues that are empty after cleaning, and duplicates.
            if char and char not in char_list:
                char_list.append(char)
    return char_list


In the case of Big Fish, we get the following list:

>>> character_list(input_ftn_file)
['EDWARD',
 'LITTLE BRAVE',
 'WILL',
 "WILL'S DATE",
 'EDWARD AND WILL',
 'SANDRA',
 'YOUNG DR. BENNETT',
 'JOSEPHINE',
 'ZACKY',
 'DON PRICE',
 'WILBUR FREELY',
 'RUTHIE',
 'ADULT EDWARD',
 'DR. BENNETT',
 'YOUNG WILL',
 'YOUNG EDWARD',
 'GIRL',
 'SHARECROPPER',
 'LITTLE GIRL',
 'HOT-BLOODED SHOTGUN TOTER',
 'MAYOR',
 'SOME FARMER',
 'SHEPHARD',
 'A VOICE',
 'VOICE',
 'KARL',
 'VARIOUS TOWNFOLK',
 "MAN'S VOICE",
 'BEAMEN',
 'MILDRED',
 'NORTHER WINSLOW',
 "A GIRL'S VOICE",
 'JENNY',
 'A DEEP VOICE',
 'CASHIER',
 'AMOS',
 "A MAN'S VOICE",
 'JUMP LEADER',
 'PING',
 'JING',
 'THE MAN',
 'TELLER WOMAN',
 "WOMAN'S VOICE",
 'STUDENT',
 'NURSE',
 'THE CROWD',
 'SON',
 'KID']

There are 48 characters in this list.
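
The byte-level replacements in character_list reflect Python 2 strings. Under Python 3, where the file is read as text, the same clean-up might look like the sketch below. The helper names clean_cue and character_cues are hypothetical, the functions take a list of lines rather than a file name, and whitespace-only lines are treated as blank, which is slightly more lenient than the test above. The trailing caret is the Fountain dual-dialogue marker.

```python
def clean_cue(line):
    """Hypothetical helper: reduce a character cue line to a bare name."""
    name = line.replace("\u00a0", " ")   # non-breaking space -> plain space
    name = name.split("(", 1)[0]         # drop "(V.O.)", "(CONT'D)", ...
    name = name.rstrip().rstrip("^")     # drop a trailing dual-dialogue caret
    return name.strip()


def character_cues(lines):
    """Return cue names in order of first appearance."""
    names = []
    # Look at each line together with its neighbours: blank before, text after.
    for prev, cur, nxt in zip(lines, lines[1:], lines[2:]):
        if prev.strip() == "" and nxt.strip() != "" and cur.isupper():
            name = clean_cue(cur)
            if name and name not in names:
                names.append(name)
    return names


# Made-up sample, not taken from any real script.
sample = ["\n", "EDWARD (V.O.)\n", "Hello.\n", "\n",
          "WILL\u00a0\n", "Dad?\n", "\n", "EDWARD\n", "Yes.\n"]
print(character_cues(sample))
```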

What if you want to know what scenes a given character occurs in? You can use the following function.

def character_scene_list(input_ftn_file, name):
    """
    INPUT:
       input_ftn_file could be "/home/wdj/my-script.fountain"
       name could be "MARY"

    OUTPUT:
       list of scenes in which name appears on any line (character cue,
       dialogue, action, or the scene heading itself)

    """
    with open(input_ftn_file) as f:
        lines = f.readlines()
    scene_lst = scene_list(input_ftn_file, line_numbers=True)
    scene_indices = [x[0] for x in scene_lst]
    char_scene_list = []
    number_of_scenes = len(scene_lst)
    # Scene j runs from its heading up to the next heading, so the final
    # scene (which has no next heading) is not searched.
    for j in range(number_of_scenes-1):
        for k in range(scene_indices[j], scene_indices[j+1]):
            # Substring test: "EDWARD" also matches "YOUNG EDWARD".
            if name in lines[k]:
                char_scene_list.append(lines[scene_indices[j]][:-1])
                break
    return char_scene_list

In the case of Big Fish, we have the following example for one of the main characters, Edward:

>>> character_scene_list(input_ftn_file, "EDWARD")
["INT.  WILL'S BEDROOM - NIGHT (1973)",
 'EXT.  CAMPFIRE - NIGHT (1977)',
 'INT.  BLOOM FRONT HALL - NIGHT (1987)',
 'INT.  TINY PARIS RESTAURANT (LA RUE 14\xc2\xb0) - NIGHT (1998)',
 'EXT.  OUTSIDE LA RUE 14\xc2\xb0 - NIGHT',
 'EXT.  RIVER - DAY',
 'INT.  747 / FLYING - NIGHT',
 'INT.  BLOOM HOUSE - NIGHT  ',
 'EXT.  FIELD AT THE SWAMP EDGE - NIGHT',
 'EXT.  A CREEPY OLD HOUSE - NIGHT  ',
 'EXT. APPROACHING THE HOUSE',
 'EXT. BACK AT THE GATE - NIGHT  ',
 "EXT.  AT THE OLD WOMAN'S DOOR - NIGHT",
 'INT.  GUEST ROOM - DAY',
 "INT.  WILL'S BEDROOM - DAY  [FLASHBACK]",
 'INT.  TINY CHURCH - DAY  ',
 "INT.  YOUNG EDWARD'S BEDROOM - DAY  ",
 'EXT.  BASEBALL FIELD - DAY  ',
 'EXT.  GRADUATION STAGE - DAY  ',
 'EXT.  COURT HOUSE - DAY',
 'EXT.  HILL OUTSIDE ASHTON - DAY',
 'EXT.  MAIN STREET OF ASHTON - DAY',
 'EXT.  ROAD - DAY',
 'EXT. FURTHER ALONG - ROUGH PATH',
 'EXT.  THE DARK FOREST - DAY [LATER]',
 'EXT.  THE TOWN OF SPECTRE   - DAY',
 "INT.   BEAMEN'S HOUSE - DAY",
 'EXT.  TOWN / MAIN STREET - DAY  ',
 'EXT.  UNDER A TREE - DUSK  ',
 'EXT.  BY THE RIVER - NIGHT',
 'EXT. BY THE RIVER - NIGHT - CONTINUOUS',
 'EXT.  PATH BACK TO TOWN - NIGHT',
 'EXT.  MAIN STREET - NIGHT',
 'EXT.  THE DARK FOREST - NIGHT',
 'EXT.  THE ROAD - DAY',
 'INT.  DINING ROOM - NIGHT ',
 'INT.  GUEST BEDROOM - NIGHT',
 'EXT.  OLYMPIA CIRCUS - NIGHT',
 'INT.  BIG-TOP - NIGHT / LATER',
 'INT. BIG TOP - NIGHT - CONTINUOUS',
 'INT.  BIG-TOP - NIGHT',
 'INT.  BIG TOP CENTER RING - NIGHT',
 'EXT.  THE HYDRA - DAY',
 'EXT.  BEHIND A TENT - DAY',
 'INT.  STABLES - DAY',
 'INT.  A DARK PLACE - NIGHT  ',
 'INT.  STABLES - NIGHT',
 "EXT.  AMOS CALLOWAY'S TRAILER - NIGHT",
 'INT.  WOODS - DAWN',
 'EXT.  BIG TOP - DAY  ',
 'EXT.  SORORITY HOUSE - DAY',
 'EXT/INT. SORORITY HOUSE - THE DOORWAY',
 'EXT.  SORORITY HOUSE - DAY',
 'INT.  FRATERNITY HOUSE BATHROOM - DAY [FLASHFORWARD]',
 'EXT.  THE SORORITY HOUSE - DAY',
 'INT.  GUEST ROOM - NIGHT [PRESENT]',
 'INT.  UPSTAIRS HALLWAY - NIGHT [CONTINUOUS]',
 'INTERCUT HALLWAY / BEDROOM',
 'INT.  HOSPITAL - DAY',
 'INT.  ARMY AIRPLANE - NIGHT',
 'EXT. ON STAGE',
 'INT.  DRESSING ROOM - NIGHT',
 'EXT.  TEMPLETON FAMILY HOUSE - DAY',
 'EXT.  BEHIND THE TEMPLETON HOUSE - DAY',
 'INT.  GUEST ROOM - DAY',
 'INT.  BASEMENT STORAGE AREA - DAY [LATER]',
 'INT.  DOWNTOWN OFFICE - DAY [STORY]',
 'EXT.  COUNTY FAIR - DAY  [STORY] ',
 'EXT.  A COUNTRY ROAD - DAY',
 'EXT.  TRAILER PARK - DAY  ',
 'INT.  HORIZON SAVINGS & LOAN - DAY ',
 'INT.  AT THE VAULT - DAY ',
 'INT.  THE VAULT - DAY  ',
 "INT.  EDWARD'S CAR - DAY  ",
 'EXT.  TEXAS ROAD - DAY ',
 "EXT.  BLOOM HOUSE [MID/LATE '70'S] - DAY  ",
 'INT.  BLOOM HOUSE BATHROOM - DAY [PRESENT] ',
 "INT.  EDWARD'S CAR / DRIVING - NIGHT ",
 "INT. EDWARD'S CAR - NIGHT - [THE STORM]",
 "EXT.  EDWARD'S CAR - NIGHT ",
 'EXT.  SPECTRE - DAY',
 'INT.  SHACK - DAY',
 'INT.  SHACK - DAY  ',
 'EXT/INT.  SWAMP SHACK - DAY',
 "INT.  JENNY'S HOUSE - DAY",
 'INT.  HOSPITAL ROOM - PRE-DAWN  ',
 'INT.  HOSPITAL ROOM - DAY [STORY VERSION]',
 'INT.  HOSPITAL HALLWAY - DAY',
 'EXT.  PARKING LOT - DAY',
 'INT.  CHEVY - DAY',
 'EXT.  ASHTON RIVER - DAY',
 'INT.  THE CHEVROLET - DAY',
 'EXT.  RIVERSIDE - DAY',
 'INT.  HOSPITAL ROOM - DAY']
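
Because character_scene_list looks for the name anywhere in a scene, the list above includes scenes where Edward is only mentioned (for example in a heading such as EDWARD'S CAR). If you want speaking parts only, one option is to group the character cues by scene in a single pass. The sketch below is an assumption-laden variant, not the code used for the output above: speakers_by_scene and scenes_with_speaker are hypothetical names, they take a list of lines, and they reuse the same INT/EXT and upper-case-cue tests as the functions in this post.

```python
def speakers_by_scene(lines):
    """Return [(heading, [speaker, ...]), ...] for every scene, including the last."""
    scenes = []
    for j, line in enumerate(lines):
        text = line.strip()
        if text[:3].upper() in ("INT", "EXT"):
            scenes.append((text, []))          # start a new scene
        elif scenes and 0 < j < len(lines) - 1:
            # Cue test: blank line before, non-blank line after, upper case.
            is_cue = (lines[j - 1].strip() == ""
                      and lines[j + 1].strip() != ""
                      and line.isupper())
            name = text.split("(", 1)[0].strip()
            if is_cue and name and name not in scenes[-1][1]:
                scenes[-1][1].append(name)
    return scenes


def scenes_with_speaker(lines, name):
    """Headings of the scenes in which name has at least one cue."""
    return [heading for heading, speakers in speakers_by_scene(lines)
            if name in speakers]


# Made-up sample, not taken from any real script.
sample = ["INT. KITCHEN - DAY\n", "\n", "MARY\n", "Coffee?\n", "\n",
          "Tom's car pulls up outside.\n", "\n",
          "EXT. DRIVEWAY - DAY\n", "\n", "TOM (O.S.)\n", "Anyone home?\n", "\n",
          "EXT. PORCH - NIGHT\n", "\n", "MARY\n", "Good night.\n"]
print(scenes_with_speaker(sample, "MARY"))
```

In the sample, Tom is mentioned in the kitchen scene's action line but is only credited with the driveway scene, where he speaks.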

Next, I explain the code that creates the character graph of a script, so you can see the relationships between the characters graphically. The character graph is the graph whose vertices are the characters with speaking parts and whose edges are the pairs of characters who have speaking lines in the same scene. (Optionally, one can weight or label each edge with the scene number.)

Here is the code:

def character_graph_simple(input_ftn_file, min_max_scene=(0, 1000)):
    """
    Returns the graph whose vertices are the characters with speaking parts and
    the edges consist of pairs of characters which have speaking lines in the same scene.
    (Assumes there are no more than 1000 scenes:-)

    Requires Sage (ZZ, Matrix, Graph).
    """
    L = character_list(input_ftn_file)
    N = len(L)
    A = [[0 for i in range(N)] for j in range(N)]
    for i in range(N):
        # Start at i+1 so that the diagonal stays zero (no loops).
        for j in range(i+1, N):
            ans = is_character_in_same_scene(input_ftn_file, L[i], L[j])
            if ans[0] and min_max_scene[0] <= ZZ(ans[1]) <= min_max_scene[1]:
                A[i][j] = ZZ(1)
                A[j][i] = ZZ(1)
    vertices = [[i, L[i]] for i in range(N)]
    return Graph(Matrix(A), format='adjacency_matrix'), Matrix(A), vertices

As you can see, this depends on the following utility function (which in turn calls characters_in_same_scene, a helper not listed in this post):

def is_character_in_same_scene(input_ftn_file, name1, name2):
    """
    Returns (True, n) if name1 and name2 are in the same scene and each have a
    speaking part, where n is the number of the first such scene found.
    Otherwise, returns (False, 0).

    """
    L = characters_in_same_scene(input_ftn_file)
    LL = [x for x in L if len(x) > 1]
    chars = character_list(input_ftn_file)
    # Always return a pair, so that callers can index the result safely.
    if name1 not in chars or name2 not in chars:
        return False, 0
    i = chars.index(name1)
    j = chars.index(name2)
    for x in LL:
        if i in x and j in x:
            return True, x[0][0]
    return False, 0
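
character_graph_simple calls is_character_in_same_scene once for every pair of characters, and each call reads and parses the file again, which likely accounts for much of the running time. A minimal sketch of a cheaper route, assuming you already have one list of speakers per scene: mark every pair that shares a scene in a single pass. The name cooccurrence_matrix and its input format are my own assumptions here, not part of fountain-parsing.sage, and the sketch ignores the min_max_scene filter.

```python
from itertools import combinations


def cooccurrence_matrix(names, scenes):
    """names: list of character names; scenes: one list of speakers per scene.

    Returns a symmetric 0/1 adjacency matrix (list of lists) with a zero diagonal.
    """
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    A = [[0] * n for _ in range(n)]
    for speakers in scenes:
        # Distinct vertex numbers of the known characters speaking in this scene.
        present = sorted({index[s] for s in speakers if s in index})
        for i, j in combinations(present, 2):
            A[i][j] = A[j][i] = 1
    return A


# Made-up sample: MARY and TOM share two scenes, ANN speaks alone.
names = ["MARY", "TOM", "ANN"]
scenes = [["MARY", "TOM"], ["ANN"], ["TOM", "MARY", "TOM"]]
print(cooccurrence_matrix(names, scenes))
```

In Sage, the resulting list of lists can be passed to Matrix and Graph in the same way as in character_graph_simple.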

Again, using John August's Big Fish as an example, we see that the character graph (whose vertices are the characters with speaking parts, and whose edges consist of pairs of characters which have speaking lines in the same scene) is:

In case that is hard to read, here is another version of the same graph:

The table which lists the characters and their associated vertex numbers is:

[[0, 'EDWARD'],
 [1, 'LITTLE BRAVE'],
 [2, 'WILL'],
 [3, "WILL'S DATE"],
 [4, 'EDWARD AND WILL'],
 [5, 'SANDRA'],
 [6, 'YOUNG DR. BENNETT'],
 [7, 'JOSEPHINE'],
 [8, 'ZACKY'],
 [9, 'DON PRICE'],
 [10, 'WILBUR FREELY'],
 [11, 'RUTHIE'],
 [12, 'ADULT EDWARD'],
 [13, 'DR. BENNETT'],
 [14, 'YOUNG WILL'],
 [15, 'YOUNG EDWARD'],
 [16, 'GIRL'],
 [17, 'SHARECROPPER'],
 [18, 'LITTLE GIRL'],
 [19, 'HOT-BLOODED SHOTGUN TOTER'],
 [20, 'MAYOR'],
 [21, 'SOME FARMER'],
 [22, 'SHEPHARD'],
 [23, 'A VOICE'],
 [24, 'VOICE'],
 [25, 'KARL'],
 [26, 'VARIOUS TOWNFOLK'],
 [27, "MAN'S VOICE"],
 [28, 'BEAMEN'],
 [29, 'MILDRED'],
 [30, 'NORTHER WINSLOW'],
 [31, "A GIRL'S VOICE"],
 [32, 'JENNY'],
 [33, 'A DEEP VOICE'],
 [34, 'CASHIER'],
 [35, 'AMOS'],
 [36, "A MAN'S VOICE"],
 [37, 'JUMP LEADER'],
 [38, 'PING'],
 [39, 'JING'],
 [40, 'THE MAN'],
 [41, 'TELLER WOMAN'],
 [42, "WOMAN'S VOICE"],
 [43, 'STUDENT'],
 [44, 'NURSE'],
 [45, 'THE CROWD'],
 [46, 'SON'],
 [47, 'KID']]

We can also plot roughly how many lines each character has (I say "roughly" because the code also counts stage directions given within the dialogue). The Sage commands

sage: chars = character_list(input_ftn_file)
sage: A = [len(character_lines_list(input_ftn_file, name)) for name in chars]
sage: bar_chart(A)

return the bar chart.

From the bar chart, we see that the characters with the most lines are 0 and 2:
sage: chars[0]; chars[2]
'EDWARD'
'WILL'
They are Edward and Will.
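
The function character_lines_list is not listed in this post. For readers who want a plain-Python count without Sage, here is a rough sketch under my own assumptions: a speech starts at a character cue and ends at the next blank line, every physical line of the speech counts as one line, and parenthetical stage directions are skipped. The name dialogue_line_counts is hypothetical.

```python
from collections import Counter


def dialogue_line_counts(lines):
    """Count dialogue lines per speaker in a list of Fountain lines."""
    counts = Counter()
    speaker = None
    for j, line in enumerate(lines):
        text = line.strip()
        if text == "":
            speaker = None                      # a blank line ends the speech
        elif speaker is not None:
            # Inside a speech: count the line unless it is a parenthetical.
            if not (text.startswith("(") and text.endswith(")")):
                counts[speaker] += 1
        elif (j > 0 and lines[j - 1].strip() == ""
              and j + 1 < len(lines) and lines[j + 1].strip() != ""
              and line.isupper()):
            # Character cue: remember who is speaking (extension removed).
            speaker = text.split("(", 1)[0].strip() or None
    return counts


# Made-up sample, not taken from any real script.
sample = ["\n", "MARY\n", "(smiling)\n", "Coffee?\n", "Or tea?\n", "\n",
          "TOM (O.S.)\n", "Tea.\n", "\n", "MARY\n", "Fine.\n"]
print(dialogue_line_counts(sample).most_common())
```

The counts can be fed to any plotting tool; in Sage, bar_chart accepts a list of numbers as in the commands above.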

Here is how to do this yourself, with little (or no) knowledge of Sage or Python.
  1. Go to cloud.sagemath.com, and create an account.
  2. Create a new project, called "fountain parsing" (or whatever).
  3. Upload fountain-parsing.sage and save to your project.
  4. Upload your script (e.g., august_Big-Fish_script-2003.fountain, from fountain.io) and save to your project.
  5. Open a new worksheet and title it whatever you like ("Fountain Parsing", for example).
  6. In a cell type
    load("fountain-parsing.sage")
    input_ftn_file = "august_Big-Fish_script-2003.fountain"
    G_bigfish = character_graph_simple(input_ftn_file)
    G_bigfish[0].show(dpi=300, frame=True,figsize=[10,15], layout="spring")
    
    and press shift-enter. This takes about 45 minutes to finish.
  7. If you don't want the character graph, but just a list of characters, enter

    load("fountain-parsing.sage")
    input_ftn_file = "august_Big-Fish_script-2003.fountain"
    chars = character_list(input_ftn_file); chars
    
    in a cell and press shift-enter. It will (immediately) print on the screen the list of characters.
  8. For a numbered list of characters, enter

    load("fountain-parsing.sage")
    input_ftn_file = "august_Big-Fish_script-2003.fountain"
    chars = character_list(input_ftn_file)
    [[i,chars[i]] for i in range(len(chars))]
    
    in a cell and press shift-enter.
  9. If you want the bar graph counting the lines each character has, enter

    load("fountain-parsing.sage")
    input_ftn_file = "august_Big-Fish_script-2003.fountain"
    chars = character_list(input_ftn_file)
    A = [len(character_lines_list(input_ftn_file, name)) for name in chars]
    bar_chart(A)
    
    in a cell and press shift-enter.