fields¶
Much of the core of gramfuzz lies in the field definitions.
Unlike other data-parsing libraries (such as pfp), gramfuzz only defines the most basic data types to be used for grammar-driven data generation.
Classes¶
Int/UInt/Float/UFloat¶
The gramfuzz.fields.Int, gramfuzz.fields.UInt, gramfuzz.fields.Float, and gramfuzz.fields.UFloat classes are the only numeric classes defined in gramfuzz.
Most things can be accomplished using these classes, their min/max settings, and their odds settings.
String¶
The gramfuzz.fields.String class can be used to generate arbitrary-length strings. All of the settings from the Int class still apply to the String class (min, max, odds), except that they influence the length of the string, not the characters it contains.
The String class has another parameter: charset. This is used to specify which characters should make up the random strings. Several pre-defined charsets exist within the String class:
charset_alpha_lower
charset_alpha_upper
charset_alpha
charset_spaces
charset_num
charset_alphanum
charset_all
For example:
s = String(charset="abcdefg", min=2, max=5)
print(s.build())
# 'aca'
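To make the roles of charset, min, and max concrete, here is a minimal standalone sketch that uses only Python's standard library. It is not gramfuzz's implementation: the random_string helper is hypothetical, and treating both length bounds as inclusive is an assumption made for illustration.

```python
import random

def random_string(charset, min_len, max_len, rng=random):
    """Hypothetical helper: pick a length, then fill it from charset."""
    # Assumption: both bounds are inclusive in this sketch.
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(charset) for _ in range(length))

sample = random_string("abcdefg", 2, 5)
print(sample)
```

The length and the characters are chosen independently, which is why min, max, and odds can control the size of the string without affecting its contents.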
And¶
The gramfuzz.fields.And class can be used to concatenate different values together:
a = And(Int, "hello")
print(a.build())
# '98hello'
And does not take any special parameters.
Or¶
The gramfuzz.fields.Or class can be used to choose randomly between several different options:
o = Or(Int, "Hello")
for x in range(10):
    print(o.build())
# Hello
# Hello
# -91
# -60
# 68
# Hello
# 13
# Hello
# Hello
# Hello
A gramfuzz.fields.WeightedOr class (aliased to gramfuzz.fields.WOr) also exists to allow weighted probabilities in Or choices:
WeightedOr(
    ("hello", 0.1), # 10% chance
    (UInt, 0.7), # 70% chance
    (3.14, 0.3), # 30% chance
)
# or
WOr(
    ("hello", 0.1), # 10% chance
    (UInt, 0.7), # 70% chance
    (3.14, 0.3), # 30% chance
)
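Note that the three weights in the example above sum to 1.1, so the percentages in its comments can only hold approximately. The sketch below is a standalone illustration of weighted choice using the standard library's random.choices, which treats weights as relative. It is not gramfuzz's code: weighted_pick is a hypothetical helper, the string "UInt" stands in for the UInt class, and whether gramfuzz normalizes weights in the same way is an assumption.

```python
import random

def weighted_pick(pairs, rng=random):
    """Hypothetical helper: choose one value using relative weights."""
    values = [value for value, _ in pairs]
    weights = [weight for _, weight in pairs]
    # random.choices treats weights as relative, so they need not sum to 1.0
    return rng.choices(values, weights=weights, k=1)[0]

pairs = [("hello", 0.1), ("UInt", 0.7), (3.14, 0.3)]
total = sum(weight for _, weight in pairs)
# effective share of each value once the weights are normalized
shares = {value: weight / total for value, weight in pairs}
print(shares)
print(weighted_pick(pairs))
```

Under this relative-weight reading, the effective chances are roughly 9%, 64%, and 27% rather than 10%, 70%, and 30%.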
Join¶
The gramfuzz.fields.Join class can be used to join values together using a separator. It also has a max value that can be used to indicate how many times the first value should be repeated (any values other than the first one will be ignored):
j = And(
    "some_function(",
    Join(
        Or(Int|Q(String)),
        sep=", ", max=5),
    ")"
)
for x in range(10):
    print(j.build())
# some_function(-4294967294, "sEaKWSOGabHf", "ZkLXWYAUyEuW", 95, "FHnVYTvB")
# some_function("koBklVcoJbDC", -60)
# some_function(-65537)
# some_function(96)
# some_function(-87, -82, "x", "LKvYXEJHegjMGh")
# some_function("TSMQeGZbXNH")
# some_function(-254, -55, -91, "N")
# some_function(-44, 84, 59, "FBPHBf", "NBZxlVq")
# some_function("BvASDsxrTnycyLBChsM", "p")
# some_function(-85, "X", "HiGdE", "XgJoNBk", 254)
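The repetition behaviour seen in the output above can be sketched without gramfuzz. This is only an illustration using the standard library: join_repeated and build_arg are hypothetical names, and always producing at least one item is an assumption based on the sample output.

```python
import random

def join_repeated(build_item, sep=", ", max_count=5, rng=random):
    """Hypothetical helper: build one item 1..max_count times and join."""
    # Assumption: at least one item is always produced in this sketch.
    count = rng.randint(1, max_count)
    return sep.join(build_item() for _ in range(count))

def build_arg():
    # stand-in for Or(Int, Q(String)): an integer or a quoted word
    if random.random() < 0.5:
        return str(random.randint(-100, 100))
    return '"' + random.choice(["x", "abc", "Hello"]) + '"'

call = "some_function(" + join_repeated(build_arg) + ")"
print(call)
```

Because the item is rebuilt on every repetition, each argument is drawn independently, which matches the mix of integers and quoted strings shown above.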
Q¶
The gramfuzz.fields.Q class can be used to surround values in quotes, optionally specifying one of two string-escaping methods:
print(Q(String).build())
# "znFHLTkwgniAXtNhI"
print(Q("'he\"llo'", escape=True).build())
# '\'he"llo\''
print(Q("<h1>'hello'</h1>", html_js_escape=True).build())
# '\x3ch1\x3e\'hello\'\x3c/h1\x3e'
escape - use Python’s repr to escape the string
html_js_escape - use Python’s "string_escape" (or "unicode_escape" in python3) encoding, as well as replacing < and > with their escaped hex formats
The Q class also accepts a quote keyword argument. This only applies if none of the escaping methods are specified, and is merely prepended and appended to the string.
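The three quoting modes described above can be approximated with plain Python. The functions below are hypothetical stand-ins, not gramfuzz's code. In particular, the exact steps inside quote_html_js (escaping single quotes and wrapping the result in single quotes) are assumptions chosen to reproduce the sample output shown earlier.

```python
def quote_repr(text):
    """Sketch of escape=True: let repr() choose quotes and escape."""
    return repr(text)

def quote_html_js(text):
    """Sketch of html_js_escape=True (the exact steps are assumptions)."""
    escaped = text.encode("unicode_escape").decode("ascii")
    escaped = escaped.replace("'", "\\'")
    # replace angle brackets with their hex escapes
    escaped = escaped.replace("<", "\\x3c").replace(">", "\\x3e")
    return "'" + escaped + "'"

def quote_plain(text, quote='"'):
    """Sketch of the default: prepend and append the quote string."""
    return quote + text + quote

print(quote_repr("'he\"llo'"))
print(quote_html_js("<h1>'hello'</h1>"))
print(quote_plain("hello", quote="`"))
```

Only quote_plain uses the quote argument, mirroring the statement that quote applies only when neither escaping method is selected.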
Opt¶
The gramfuzz.fields.Opt class inherits from the And class and can be used to wrap values, optionally raising an OptGram exception when built.
If an OptGram exception is raised, the current value being built will be ignored.
j = And(
    "some_function(",
    Join(
        Int,
        Q(String),
        Opt(Float), # optional argument
        sep=", ", max=5),
    ")"
)
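The skip-on-exception flow can be illustrated with a small standalone sketch. Nothing here is gramfuzz API: SkipValue is a hypothetical stand-in for OptGram, and optional and join_skipping are made-up helpers that show how an enclosing container might drop a value whose build raised.

```python
import random

class SkipValue(Exception):
    """Hypothetical stand-in for the OptGram exception."""

def optional(build_item, prob=0.5, rng=random):
    """Return a builder that raises SkipValue with probability prob."""
    def build():
        if rng.random() < prob:
            raise SkipValue()
        return build_item()
    return build

def join_skipping(builders, sep=", "):
    """Join built values, ignoring any builder that raises SkipValue."""
    parts = []
    for build in builders:
        try:
            parts.append(build())
        except SkipValue:
            continue  # the optional value is simply left out
    return sep.join(parts)

always = join_skipping([lambda: "1", optional(lambda: "2.5", prob=0.0)])
never = join_skipping([lambda: "1", optional(lambda: "2.5", prob=1.0)])
print(always)
print(never)
```

With prob=0.0 the optional value is always kept, and with prob=1.0 it is always skipped, so the two printed lines show both outcomes.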
Details¶
Field Base Class¶
All fields inherit from the gramfuzz.fields.Field class.
Both Field classes and instances of Field classes may be used in grammar rule definitions (thanks to the gramfuzz.fields.MetaField class and the gramfuzz.utils.val function).
For example, defining a rule that uses the Int field with the default settings may look something like this:
Def("integer_rule", Int)
However, if more specific settings for the Int class were desired, it may look something like this instead:
Def("integer_rule", Int(min=10, max=20))
Note that this can also be abstracted to something like this:
SPECIAL_INT = Int(min=10, max=20)
Def("integer_rule", SPECIAL_INT)
Def("integer_rule2", SPECIAL_INT, ",", SPECIAL_INT)
This pattern is highly recommended and prevents one from constantly hard-coding specific settings throughout a grammar.
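The ability to pass a Field class, a Field instance, or a native value in the same position can be sketched as below. This is a standalone illustration of that kind of normalization only. The Field, Const, and build_value names are hypothetical, and the real gramfuzz.utils.val function may behave differently.

```python
import inspect

class Field:
    """Hypothetical minimal base class for this sketch."""
    def build(self):
        raise NotImplementedError

class Const(Field):
    """Hypothetical field that always builds the same text."""
    def __init__(self, text="7"):
        self.text = text

    def build(self):
        return self.text

def build_value(value):
    """Hypothetical helper: accept a Field class, instance, or literal."""
    if inspect.isclass(value) and issubclass(value, Field):
        return value().build()   # class: instantiate with defaults first
    if isinstance(value, Field):
        return value.build()     # instance: build with its own settings
    return str(value)            # native value: use its text form

print(build_value(Const), build_value(Const("42")), build_value(5))
```

A shared instance such as SPECIAL_INT above fits the second branch: its settings are defined once and reused wherever the instance appears.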
Operator Overloading¶
gramfuzz defines two main classes for concatenating or randomly choosing between values: gramfuzz.fields.And and gramfuzz.fields.Or.
The And and Or classes can be explicitly used:
And(Int, ", ", Int)
Or(Int, Float)
Or they can be used via the overloaded & and | operators:
(Int & ", " & Int)
(Int | Float)
There are a few drawbacks, however, mostly having to do with the fact that gramfuzz cannot tell where the parentheses are in more complex scenarios, like the one below:
(Int & (Int & (Int & Int)))
Ideally, the statement above would generate something like the statement below, which uses explicit Ands:
And(Int, And(Int, And(Int, Int)))
In reality, gramfuzz ends up doing this instead:
And(Int, Int, Int, Int)
# from the python console:
# >>> a = (Int & (Int & (Int & Int)))
# >>> a
# <And[<Int>,<Int>,<Int>,<Int>]>
For this reason, I tend to use the overloaded operators only in simple situations.
For complex logic/scenarios, I tend to use only explicit And and Or.
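One way such flattening can arise is sketched below with stand-in classes. This is an assumption about the mechanism, not gramfuzz's actual code: Part, Seq, combine, and the from_operator flag are all hypothetical names. The idea is that a sequence created by an operator absorbs further operands, so the grouping implied by parentheses never reaches the result.

```python
class Part:
    """Hypothetical leaf value."""
    def __init__(self, name):
        self.name = name

    def __and__(self, other):
        return combine(self, other)

    def __repr__(self):
        return self.name

class Seq(Part):
    """Hypothetical stand-in for And."""
    def __init__(self, *parts, from_operator=False):
        self.parts = list(parts)
        self.from_operator = from_operator

    def __repr__(self):
        return "Seq(" + ", ".join(map(repr, self.parts)) + ")"

def combine(left, right):
    # An operator-built Seq on either side absorbs the other operand,
    # so the grouping implied by parentheses is lost.
    if isinstance(left, Seq) and left.from_operator:
        left.parts.append(right)
        return left
    if isinstance(right, Seq) and right.from_operator:
        right.parts.insert(0, left)
        return right
    return Seq(left, right, from_operator=True)

a, b, c, d = (Part(n) for n in "abcd")
flat = a & (b & (c & d))
explicit = Seq(a, Seq(b, Seq(c, d)))
print(flat)
print(explicit)
```

Only the explicitly constructed version keeps its nesting, which is the practical reason to prefer explicit constructors in complex grammars.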
Gotcha¶
One important gotcha is shown below:
5 | Int | Float
The above example does not work because 5 is the first operand in the or sequence. This is due to the way Python handles operator overloading.
However, this will work:
Int | Float | 5
Native types can only be used with the Field overloaded operators if they are not the first operand.
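The gotcha can be reproduced in plain Python with a hypothetical metaclass that, like the behaviour described above, only handles the case where the class is the left operand. Meta and Num are made-up names for this sketch and are not part of gramfuzz.

```python
class Meta(type):
    """Hypothetical metaclass that overloads | for classes."""
    def __or__(cls, other):
        return ("or", cls.__name__, other)
    # Meta defines no __ror__ that accepts a native value on the left.

class Num(metaclass=Meta):
    pass

print(Num | 5)  # Meta.__or__ is used because Num is the left operand

try:
    5 | Num     # int.__or__ cannot handle Num, and nothing else steps in
except TypeError as exc:
    print("TypeError:", exc)
```

Python asks the left operand's type first, so when a native int is on the left the overloaded method on the Field side is never reached with a usable fallback.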
Odds¶
The gramfuzz.fields.Int, gramfuzz.fields.UInt, gramfuzz.fields.Float, gramfuzz.fields.UFloat, and gramfuzz.fields.String classes each make use of the gramfuzz.fields.Field.odds member when generating data, as well as (optionally) min and max members.
An example of using the odds member can be seen in the default values for the Int, Float, and String classes:
class Float(Int):
    # ...
    odds = [
        (0.75, [0.0,100.0]), # 75% chance of being in the range [0.0, 100.0)
        (0.05, 0), # 5% chance of having the value 0
        (0.10, [100.0, 1000.0]), # 10% chance of being in the range [100.0, 1000.0)
        (0.10, [1000.0, 100000.0]), # 10% chance of being in the range [1000.0, 100000.0)
    ]
    # ...
It should be noted that the probability percents in each entry in the odds list should add up to 1.0.
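How a list in this format could drive value selection is sketched below using only the standard library. The pick_from_odds helper is hypothetical, and the cumulative-probability walk is one common technique rather than a description of gramfuzz's internals.

```python
import random

def pick_from_odds(odds, rng=random):
    """Hypothetical helper: pick a value from (probability, target) entries."""
    total = sum(prob for prob, _ in odds)
    assert abs(total - 1.0) < 1e-9, "probabilities should sum to 1.0"
    roll = rng.random()
    cumulative = 0.0
    for prob, target in odds:
        cumulative += prob
        if roll < cumulative:
            break
    # after the loop, target is the selected entry (the last one if
    # rounding left roll just above the final cumulative value)
    if isinstance(target, (list, tuple)):
        low, high = target
        return rng.uniform(low, high)  # a range: draw a value inside it
    return target                      # a single value: return it as-is

float_odds = [
    (0.75, [0.0, 100.0]),
    (0.05, 0),
    (0.10, [100.0, 1000.0]),
    (0.10, [1000.0, 100000.0]),
]
value = pick_from_odds(float_odds)
print(value)
```

The sum check at the top of the helper reflects the requirement that the probabilities add up to 1.0. Without it, a list that sums to less than 1.0 would silently over-select its last entry.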
See the documentation below for more details:
gramfuzz.fields.Int.__init__ (for min/max)
gramfuzz.fields Reference Documentation¶
This module defines all of the core gramfuzz fields:
Float
UFloat
Int
UInt
Join
Opt
Or
Q
Ref
String
Each field has a build() method, which accepts a pre argument that can be used to assign prerequisites of the build result. The build() signatures listed in this reference also include a shortest argument.
class gramfuzz.fields.And(*values, **kwargs)¶
A Field subclass that concatenates two values together. This class works nicely with Opt values.
class gramfuzz.fields.Def(name, *values, **options)¶
The Def class is used to define grammar rules. A defined rule has three parts:
Name - A rule name can be declared multiple times. When a rule name with multiple definitions is generated, one of the rule definitions will be chosen at random.
Values - The values of the rule. These will be concatenated (acts the same as an And).
Category - Which category to define the rule in. This is an important step and guides the fuzzer into choosing the correct rule definitions when randomly choosing rules to generate.
For example, suppose we defined a grammar for various types of postal addresses. We could have a grammar for US addresses, UK addresses, and Australian addresses. When we want the fuzzer to generate a random address, we would want it to choose one from our US, UK, or Australian address rule and not choose to generate only a zipcode rule.
I often have a main X category, as well as an X_def category. The X category is what I tell the fuzzer to choose from when randomly generating top-level rules. The X_def category is only used to help build the top-level rules.
__init__(name, *values, **options)¶
Create a new rule definition. Simply instantiating a new rule definition will add it to the current GramFuzzer instance.
Parameters
name (str) – The name of the rule being defined
values (list) – The list of values that define the value of the rule (will be concatenated when built)
cat (str) – The category to create the rule in (default="default").
no_prune (bool) – If this rule should not be pruned EVEN IF it is found to be unreachable (default=False)
build(pre=None, shortest=False)¶
Build this rule definition
Parameters
pre (list) – The prerequisites list
shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.
no_prune = False¶
Whether or not this rule should be pruned if the fuzzer cannot find a way to reach this rule. (default=False)
class gramfuzz.fields.Field¶
The core class that all field classes are based on. Contains utility methods to determine probabilities/choices/min-max/etc.
odds = []¶
odds is a list of tuples that define probability values.
Each item in the list must be a tuple of the form:
(X, Y)
Where X is the probability percent, and where Y is one of the following:
A single value
A list/tuple containing two values, the min and max of a range of numbers.
Note that the sum of each probability percent in the list must equal 1.0.
shortest_is_nothing = False¶
This is used during gramfuzz.GramFuzzer.find_shortest_paths. Sometimes the fuzzer cannot know based on the values in a field what that field’s minimal behavior will be.
Setting this to True will explicitly let the GramFuzzer instance know what the minimal outcome will be.
NOTE when implementing a custom Field subclass and setting shortest_is_nothing to True, be sure to handle the case when build(shortest=True) is called so that a gramfuzz.errors.OptGram error is raised (which skips the current field from being generated).
class gramfuzz.fields.Float(value=None, **kwargs)¶
Defines a float Field with odds that define float values
class gramfuzz.fields.Int(value=None, **kwargs)¶
Represents all Integers, with predefined odds that target boundary conditions.
__init__(value=None, **kwargs)¶
Create a new Int object, optionally specifying a hard-coded value
Parameters
value (int) – The value of the new int object
min (int) – The minimum value (if value is not specified)
max (int) – The maximum value (if value is not specified)
odds (list) – The probability list. See Field.odds for more information.
build(pre=None, shortest=False)¶
Build the integer, optionally providing a pre list that may be used to define prerequisites for a Field being built.
Parameters
pre (list) – A list of prerequisites to be collected during the building of a Field.
shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.
class gramfuzz.fields.Join(*values, **kwargs)¶
A Field subclass that joins other values with a separator. This class works nicely with Opt values.
__init__(*values, **kwargs)¶
Create a new instance of the Join class.
Parameters
values (list) – The values to join
sep (str) – The string with which to separate each of the values (default=",")
max (int) – The maximum number of times (inclusive) to build the first item in values. This can be useful when a variable number of items in a list is needed. E.g.:
Join(Int, max=5, sep=",")
class gramfuzz.fields.MetaField¶
Used as the metaclass of the core gramfuzz.fields.Field class. MetaField defines __and__ and __or__ and __repr__ methods. The overridden & and | operators allow classes themselves to be wrapped in a gramfuzz.fields.And or gramfuzz.fields.Or without having to instantiate them first.
E.g. the two lines in each pair below are equivalent:
And(Int(), Float())
(Int & Float)
Or(Int(), Float())
(Int | Float)
Do note however that this can only be done if the first (farthest to the left) operand is a Field class or instance.
E.g. the first line below will work, but the second line will not:
Or(5, Int)
5 | Int
It is also recommended that the overloaded & and | operators only be used in very simple cases, since it is impossible for the code to know the difference between the two statements below:
(Int | Float) | UInt
Int | Float | UInt
class gramfuzz.fields.Opt(*values, **kwargs)¶
A Field subclass that randomly chooses to either build the provided values (acts as an And in that case), or raise an errors.OptGram exception.
When an errors.OptGram exception is raised, the current value being built is then skipped.
__init__(*values, **kwargs)¶
Create a new Opt instance
Parameters
values (list) – The list of values to build (or not)
prob (float) – A float value between 0 and 1 that defines the probability of cancelling the current build.
build(pre=None, shortest=False)¶
Build the current Opt instance
Parameters
pre (list) – The prerequisites list
shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.
prob = 0.5¶
The probability of an Opt instance raising an errors.OptGram exception
class gramfuzz.fields.Or(*values, **kwargs)¶
A Field subclass that chooses one of the provided values at random as the result of a call to the build() method.
class gramfuzz.fields.PLUS(*values, **kwargs)¶
Acts like the + in a regex - one or more of the values. The values are Anded together one or more times, up to max times.
__init__(*values, **kwargs)¶
Create a new instance of the Join class.
Parameters
values (list) – The values to join
sep (str) – The string with which to separate each of the values (default=",")
max (int) – The maximum number of times (inclusive) to build the first item in values. This can be useful when a variable number of items in a list is needed. E.g.:
Join(Int, max=5, sep=",")
class gramfuzz.fields.Q(*values, **kwargs)¶
A Field subclass that quotes whatever value is provided.
__init__(*values, **kwargs)¶
Create the new Quote instance
Parameters
escape (bool) – Whether or not quoted data should be escaped (default=False)
html_js_escape (bool) – Whether or not quoted data should be html-javascript escaped (default=False)
quote (str) – The quote character to be used if escape and html_js_escape are False
build(pre=None, shortest=False)¶
Build the Quote instance
Parameters
pre (list) – The prerequisites list
shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.
escape = False¶
Whether or not the quoted data should be escaped (default=False). Uses repr(X)
html_js_escape = False¶
Whether or not the quoted data should be html-javascript escaped (default=False)
class gramfuzz.fields.Ref(refname, **kwargs)¶
The Ref class is used to reference defined rules by their name. If a rule name is defined multiple times, one will be chosen at random.
For example, suppose we have a rule that returns an integer:
Def("integer", UInt)
We could define another rule that creates a Float by referencing the integer rule twice, and placing a period between them:
Def("float", Ref("integer"), ".", Ref("integer"))
__init__(refname, **kwargs)¶
Create a new Ref instance
Parameters
refname (str) – The name of the rule to reference
cat (str) – The name of the category the rule is defined in
build(pre=None, shortest=False)¶
Build the Ref instance by fetching the rule from the GramFuzzer instance and building it
Parameters
pre (list) – The prerequisites list
shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.
cat = 'default'¶
The default category where the referenced rule definition will be looked for
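The interplay between rule definitions and references described for Def and Ref can be sketched with a plain dictionary. This is a standalone illustration, not gramfuzz's implementation: RULES, define, and build_ref are hypothetical names, and callables stand in for nested references.

```python
import random

# Hypothetical registry: category -> rule name -> list of definitions
RULES = {}

def define(name, *values, cat="default"):
    """Sketch of Def: register one more definition under a name."""
    RULES.setdefault(cat, {}).setdefault(name, []).append(values)

def build_ref(name, cat="default", rng=random):
    """Sketch of Ref: pick one definition at random and concatenate it."""
    definition = rng.choice(RULES[cat][name])
    parts = []
    for value in definition:
        # a callable stands in for a nested reference; anything else is text
        parts.append(value() if callable(value) else str(value))
    return "".join(parts)

define("integer", lambda: str(random.randint(0, 99)))
define("float", lambda: build_ref("integer"), ".", lambda: build_ref("integer"))

result = build_ref("float")
print(result)
```

Defining the same name more than once simply appends to the list, which is how a reference ends up choosing among several definitions at random.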
class gramfuzz.fields.STAR(*values, **kwargs)¶
Acts like the * in a regex - zero or more of the values. The values are Anded together zero or more times, up to max times.
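The difference between the two regex-style repeaters comes down to the lower bound on the repeat count. The sketch below shows this with hypothetical helpers (repeat, star, plus) built on the standard library. It illustrates the idea only and does not reflect how PLUS and STAR are implemented.

```python
import random

def repeat(build_item, min_count, max_count, rng=random):
    """Hypothetical helper: concatenate min_count..max_count builds."""
    count = rng.randint(min_count, max_count)
    return "".join(build_item() for _ in range(count))

def star(build_item, max_count=5):
    return repeat(build_item, 0, max_count)  # zero or more, like regex *

def plus(build_item, max_count=5):
    return repeat(build_item, 1, max_count)  # one or more, like regex +

print(repr(star(lambda: "ab")))
print(repr(plus(lambda: "ab")))
```

Only star can produce an empty result, which matters when the surrounding grammar requires at least one item.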
class gramfuzz.fields.String(value=None, **kwargs)¶
Defines a string field
__init__(value=None, **kwargs)¶
Create a new instance of the String field.
Parameters
value – The hard-coded value of the String field
min (int) – The minimum size of the String when built
max (int) – The maximum size of the String when built
charset (str) – The character-set to be used when building the string
build(pre=None, shortest=False)¶
Build the String instance
Parameters
pre (list) – The prerequisites list (optional, default=None)
shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.
charset = b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'¶
The default character set of the String field class (default=charset_alpha)
charset_all
= b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\xc2\x80\xc2\x81\xc2\x82\xc2\x83\xc2\x84\xc2\x85\xc2\x86\xc2\x87\xc2\x88\xc2\x89\xc2\x8a\xc2\x8b\xc2\x8c\xc2\x8d\xc2\x8e\xc2\x8f\xc2\x90\xc2\x91\xc2\x92\xc2\x93\xc2\x94\xc2\x95\xc2\x96\xc2\x97\xc2\x98\xc2\x99\xc2\x9a\xc2\x9b\xc2\x9c\xc2\x9d\xc2\x9e\xc2\x9f\xc2\xa0\xc2\xa1\xc2\xa2\xc2\xa3\xc2\xa4\xc2\xa5\xc2\xa6\xc2\xa7\xc2\xa8\xc2\xa9\xc2\xaa\xc2\xab\xc2\xac\xc2\xad\xc2\xae\xc2\xaf\xc2\xb0\xc2\xb1\xc2\xb2\xc2\xb3\xc2\xb4\xc2\xb5\xc2\xb6\xc2\xb7\xc2\xb8\xc2\xb9\xc2\xba\xc2\xbb\xc2\xbc\xc2\xbd\xc2\xbe\xc2\xbf\xc3\x80\xc3\x81\xc3\x82\xc3\x83\xc3\x84\xc3\x85\xc3\x86\xc3\x87\xc3\x88\xc3\x89\xc3\x8a\xc3\x8b\xc3\x8c\xc3\x8d\xc3\x8e\xc3\x8f\xc3\x90\xc3\x91\xc3\x92\xc3\x93\xc3\x94\xc3\x95\xc3\x96\xc3\x97\xc3\x98\xc3\x99\xc3\x9a\xc3\x9b\xc3\x9c\xc3\x9d\xc3\x9e\xc3\x9f\xc3\xa0\xc3\xa1\xc3\xa2\xc3\xa3\xc3\xa4\xc3\xa5\xc3\xa6\xc3\xa7\xc3\xa8\xc3\xa9\xc3\xaa\xc3\xab\xc3\xac\xc3\xad\xc3\xae\xc3\xaf\xc3\xb0\xc3\xb1\xc3\xb2\xc3\xb3\xc3\xb4\xc3\xb5\xc3\xb6\xc3\xb7\xc3\xb8\xc3\xb9\xc3\xba\xc3\xbb\xc3\xbc\xc3\xbd\xc3\xbe\xc3\xbf'¶ All possible binary characters (
0x0-0xff)
charset_alpha = b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'¶
Upper- and lower-case alphabet
charset_alpha_lower = b'abcdefghijklmnopqrstuvwxyz'¶
A lower-case alphabet character set
charset_alpha_upper = b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'¶
An upper-case alphabet character set
charset_alphanum = b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890'¶
Alpha-numeric character set (upper- and lower-case alphabet + numbers)
charset_num = b'1234567890'¶
Numeric character set
charset_spaces = b'\n\r\t '¶
Whitespace character set
odds = [(0.85, [0, 20]), (0.1, 1), (0.025, 0), (0.025, [20, 100])]¶
Unlike numeric Field types, the odds value for the String field defines the length of the field, not characters used in the string.
See the gramfuzz.fields.Field.odds member for details on the format of the odds probability list.
gramfuzz.fields.WOr¶
alias of gramfuzz.fields.WeightedOr
class gramfuzz.fields.WeightedOr(*values, **kwargs)¶
A Field subclass that chooses one of the provided values at random as the result of a call to the build() method. Takes an odds array rather than just direct values. Also aliased to WOr.
E.g.
WeightedOr(
    ("hello", 0.1), # 10% chance
    (UInt, 0.7), # 70% chance
    (3.14, 0.3), # 30% chance
)
# or
WOr(
    ("hello", 0.1), # 10% chance
    (UInt, 0.7), # 70% chance
    (3.14, 0.3), # 30% chance
)