Skip to content

Build a PDFA

PDFA

In this notebook, we will see how to use the PDFA class.

Example

Utility functions to display SVGs.

%matplotlib inline

import numpy as np

from pdfa_learning.pdfa import PDFA
from helpers import render_automaton

The following automaton captures all the sequences of only heads, followed by one tail.

def make_automaton(p: float = 0.5) -> PDFA:
    """
    Make the PDFA for the heads and tail example.

    :param p: the probability of getting head.
    :return: the PDFA.
    """
    return PDFA(
        nb_states=2,
        alphabet_size=2,
        transition_dict={
            0: {
                0: (0, p),
                1: (1, 1 - p),
            },
            1: {
                -1: (-1, 1.0)
            }
        }
    )

automaton = make_automaton(0.5)
render_automaton(automaton)
%3 fake 0 0 fake->0 0->0 0, 0.5 1 1 0->1 1, 0.5 -1 -1 1->-1 -1, 1.0

Sample a word from the PDFA above.

automaton.sample()
[1, -1]

With probability p = 0.5 the trace stops. Hence, The average length of the trace is:

1 + \sum\limits_{n=1}^{\infty} n\cdot p^{n-1}p = 1 + \frac{1}{p}

Which for p=\frac{1}{2}, it is 3 (+1 is used to count the termination symbol).

ps = [0.1, 0.5, 0.9]
expected_length = lambda x: 1 + 1 / x

nb_samples = 10000
for p in ps:
    stop_probability = 1 - p
    _automaton = make_automaton(stop_probability)
    samples = [_automaton.sample() for _ in range(nb_samples)]
    average_length = np.mean([len(l) for l in samples])
    print(f"The average length of the samples is: {average_length:5.2f}. Expected: {expected_length(p):5.2f}.")
The average length of the samples is: 10.80. Expected: 11.00.
The average length of the samples is:  3.01. Expected:  3.00.
The average length of the samples is:  2.11. Expected:  2.11.