fields¶
Much of the core of gramfuzz lies in the field definitions.
Unlike other data-parsing libraries (such as pfp), gramfuzz only defines the most basic data types to be used for grammar-driven data generation.
Classes¶
Int/UInt/Float/UFloat¶
The gramfuzz.fields.Int, gramfuzz.fields.UInt, gramfuzz.fields.Float,
and gramfuzz.fields.UFloat classes are the only numeric classes defined in gramfuzz.
Most things can be accomplished using these classes, their min/max settings,
and their odds settings.
String¶
The gramfuzz.fields.String class can be used to generate arbitrary-length strings. All of the
settings from the Int class still apply to the String class (min, max,
odds), except that they influence the length of the string, not the characters
it contains.
The String class has another parameter: charset. This is used to specify
which characters should make up the random strings. Several pre-defined charsets
exist within the String class:
charset_alpha_lower
charset_alpha_upper
charset_alpha
charset_spaces
charset_num
charset_alphanum
charset_all
For example:
s = String(charset="abcdefg", min=2, max=5)
print(s.build())
# 'aca'
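To make the split between length settings and charset concrete, here is a small standalone sketch in plain Python. It does not use gramfuzz, and the helper name random_string is made up for illustration. It also assumes the length is drawn from the half-open range [min, max), which may not match what gramfuzz does internally.

```python
import random

def random_string(charset, min_len, max_len, rng=random):
    """Build a string of random length from the characters in charset."""
    # Assumption: the length is drawn from the half-open range [min_len, max_len).
    length = rng.randrange(min_len, max_len)
    # The charset only decides which characters appear, never how many.
    return "".join(rng.choice(charset) for _ in range(length))

print(random_string("abcdefg", 2, 5))
```

The two concerns stay independent: changing the charset never changes the length distribution, and vice versa.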
And¶
The gramfuzz.fields.And class can be used to concatenate different values
together:
a = And(Int, "hello")
print(a.build())
# '98hello'
And does not take any special parameters.
Or¶
The gramfuzz.fields.Or class can be used to choose randomly between several
different options:
o = Or(Int, "Hello")
for x in range(10):
    print(o.build())
# Hello
# Hello
# -91
# -60
# 68
# Hello
# 13
# Hello
# Hello
# Hello
A gramfuzz.fields.WeightedOr class (aliased to gramfuzz.fields.WOr)
also exists to allow weighted probabilities in Or choices:
WeightedOr(
    ("hello", 0.1), # 10% chance
    (UInt, 0.7), # 70% chance
    (3.14, 0.2), # 20% chance
)
# or
WOr(
    ("hello", 0.1), # 10% chance
    (UInt, 0.7), # 70% chance
    (3.14, 0.2), # 20% chance
)
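The idea behind a weighted choice can be sketched in plain Python with the standard random.choices function. This is only an illustration of the concept, not gramfuzz's implementation. The helper name weighted_choice is hypothetical, and the check that the weights total 1.0 is an assumption of this sketch.

```python
import random

def weighted_choice(pairs, rng=random):
    """Pick one value from (value, weight) pairs, honoring the weights."""
    values = [value for value, _ in pairs]
    weights = [weight for _, weight in pairs]
    # Assumption of this sketch: weights are probabilities that total 1.0.
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0, got %r" % sum(weights))
    return rng.choices(values, weights=weights, k=1)[0]

# Placeholder values stand in for real fields.
options = [("hello", 0.1), ("uint", 0.7), ("pi", 0.2)]
print(weighted_choice(options))
```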
Join¶
The gramfuzz.fields.Join class can be used to join values together using a
separator. It also has a max value that can be used to indicate how many
times the first value should be repeated (any values other than the first one
will be ignored):
j = And(
    "some_function(",
    Join(
        Or(Int, Q(String)),
        sep=", ", max=5),
    ")"
)
for x in range(10):
    print(j.build())
# some_function(-4294967294, "sEaKWSOGabHf", "ZkLXWYAUyEuW", 95, "FHnVYTvB")
# some_function("koBklVcoJbDC", -60)
# some_function(-65537)
# some_function(96)
# some_function(-87, -82, "x", "LKvYXEJHegjMGh")
# some_function("TSMQeGZbXNH")
# some_function(-254, -55, -91, "N")
# some_function(-44, 84, 59, "FBPHBf", "NBZxlVq")
# some_function("BvASDsxrTnycyLBChsM", "p")
# some_function(-85, "X", "HiGdE", "XgJoNBk", 254)
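The repeat-and-join behavior seen in the output above can be approximated in a few lines of plain Python. This is a rough sketch rather than gramfuzz code. The names join_repeated and random_arg are invented here, and the sketch assumes at least one item is always built.

```python
import random

def join_repeated(build_item, sep=",", max_count=1, rng=random):
    """Build the item between 1 and max_count times (inclusive), joined by sep."""
    # Assumption: at least one item is always produced.
    count = rng.randint(1, max_count)
    # build_item is called again for every repetition, so each item can differ.
    return sep.join(str(build_item()) for _ in range(count))

def random_arg():
    """Hypothetical argument builder: an integer or a quoted string."""
    return random.choice([random.randint(-100, 100), '"abc"'])

print("some_function(" + join_repeated(random_arg, sep=", ", max_count=5) + ")")
```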
Q¶
The gramfuzz.fields.Q class can be used to surround values in quotes, optionally
specifying one of two string-escaping methods:
print(Q(String).build())
# "znFHLTkwgniAXtNhI"
print(Q("'he\"llo'", escape=True).build())
# '\'he"llo\''
print(Q("<h1>'hello'</h1>", html_js_escape=True).build())
# '\x3ch1\x3e\'hello\'\x3c/h1\x3e'
escape - use Python’s repr to escape the string
html_js_escape - use Python’s "string_escape" (or "unicode_escape" in python3) encoding, as well as replacing < and > with their escaped hex formats
The Q class also accepts a quote keyword argument. This only applies if neither
of the escaping methods is specified, and is merely prepended and appended to the
string.
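The three quoting modes can be mimicked with a standalone function. This is a sketch of the described behavior under Python 3, not the Q class itself. The function name quote_value is hypothetical, and the explicit escaping of single quotes in the html_js_escape branch is an assumption made so that the result matches the sample output above.

```python
def quote_value(value, escape=False, html_js_escape=False, quote='"'):
    """Quote a string using one of the three modes described above."""
    if escape:
        # repr() picks the quote character and escapes as needed.
        return repr(value)
    if html_js_escape:
        escaped = value.encode("unicode_escape").decode("ascii")
        # Assumption: single quotes are escaped explicitly, since the
        # result is wrapped in single quotes.
        escaped = escaped.replace("'", "\\'")
        escaped = escaped.replace("<", "\\x3c").replace(">", "\\x3e")
        return "'" + escaped + "'"
    # No escaping requested: the quote string is simply added on both sides.
    return quote + value + quote

print(quote_value("hello"))
print(quote_value("'he\"llo'", escape=True))
print(quote_value("<h1>'hello'</h1>", html_js_escape=True))
```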
Opt¶
The gramfuzz.fields.Opt class inherits from the And class and can be used to
wrap values, optionally raising an OptGram exception when built.
If an OptGram exception is raised, the current value being built will be ignored.
j = And(
    "some_function(",
    Join(
        Int,
        Q(String),
        Opt(Float), # optional argument
        sep=", ", max=5),
    ")"
)
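The exception-based skipping can be illustrated without gramfuzz. In this sketch SkipValue stands in for the OptGram exception, and optional and concat are hypothetical helpers. None of this is gramfuzz's actual code.

```python
import random

class SkipValue(Exception):
    """Stand-in for the OptGram exception described above."""

def optional(build, prob=0.5, rng=random):
    """Wrap a builder so that it is skipped with probability prob."""
    def wrapped():
        if rng.random() < prob:
            raise SkipValue()
        return build()
    return wrapped

def concat(builders):
    """Concatenate built values, ignoring any that signal a skip."""
    parts = []
    for build in builders:
        try:
            parts.append(build())
        except SkipValue:
            continue  # the optional value is left out of the result
    return "".join(parts)

print(concat([lambda: "a", optional(lambda: "b"), lambda: "c"]))
```

Raising an exception lets the enclosing container decide what skipping means, which is why an optional value composes with concatenating and joining containers alike.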
Details¶
Field Base Class¶
All fields inherit from the gramfuzz.fields.Field class.
Field classes may be used in grammar rule definitions, as well as instances
of Field classes (thanks to the gramfuzz.fields.MetaField class and the
gramfuzz.utils.val function).
For example, defining a rule that uses the Int field with the default settings
may look something like this:
Def("integer_rule", Int)
However, if more specific settings for the Int class were desired, it may
look something like this instead:
Def("integer_rule", Int(min=10, max=20))
Note that this can also be abstracted to something like this:
SPECIAL_INT = Int(min=10, max=20)
Def("integer_rule", SPECIAL_INT)
Def("integer_rule2", SPECIAL_INT, ",", SPECIAL_INT)
This pattern is highly recommended and prevents one from constantly hard-coding specific settings throughout a grammar.
Operator Overloading¶
gramfuzz defines two main classes for concatenating or randomly
choosing between values: gramfuzz.fields.And and gramfuzz.fields.Or.
The And and Or classes can be explicitly used:
And(Int, ", ", Int)
Or(Int, Float)
Or they can be used via the overloaded & and | operators:
(Int & ", " & Int)
(Int | Float)
There are a few drawbacks however, mostly having to do with the fact that gramfuzz cannot tell where the parentheses are in more complex scenarios, like the one below:
(Int & (Int & (Int & Int)))
Ideally, the statement above would generate something like the statement
below, which uses explicit And s:
And(Int, And(Int, And(Int, Int)))
When in reality, gramfuzz ends up doing this instead:
And(Int, Int, Int, Int)
# from the python console:
# >>> a = (Int & (Int & (Int & Int)))
# >>> a
# <And[<Int>,<Int>,<Int>,<Int>]>
For this reason, I tend to use the overloaded operators only in simple situations.
For complex logic/scenarios I tend to use only explicit And and Or.
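Why the parentheses get lost can be shown with a toy model. Python evaluates each & separately and passes only the two operands to __and__, so the method never sees how the expression was grouped. The Term and Seq classes below are hypothetical and much simpler than gramfuzz's fields. They merge operands in one plausible way, which is an assumption rather than the library's exact logic.

```python
class Term:
    """Toy stand-in for a field; & merges operands into one flat Seq."""
    def __init__(self, name):
        self.name = name

    def __and__(self, other):
        # Only the two operands are visible here, not the parentheses.
        return Seq.merge(self, other)

    def __repr__(self):
        return self.name

class Seq(Term):
    """Toy stand-in for And."""
    def __init__(self, *values):
        self.values = list(values)

    @staticmethod
    def merge(left, right):
        values = []
        for side in (left, right):
            values.extend(side.values if isinstance(side, Seq) else [side])
        return Seq(*values)

    def __repr__(self):
        return "Seq[" + ",".join(repr(v) for v in self.values) + "]"

a, b, c, d = Term("a"), Term("b"), Term("c"), Term("d")
print(a & (b & (c & d)))            # Seq[a,b,c,d]
print(Seq(a, Seq(b, Seq(c, d))))    # Seq[a,Seq[b,Seq[c,d]]]
```

Explicit constructor calls keep the nesting because the nesting is spelled out in the arguments themselves.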
Gotcha¶
One important gotcha is shown below:
5 | Int | Float
The above example does not work because 5 is the first operand
in the or sequence. This is due to the way Python handles
operator overloading.
However, this will work:
Int | Float | 5
Native types can only be used with the Field overloaded operators
if they are not the first operand.
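The Python behavior behind this gotcha can be reproduced with a toy class. For x | y, Python first tries the left operand's __or__. When that returns NotImplemented (as int does for an unknown type), it falls back to the right operand's reflected __ror__, and raises TypeError if there is none. The Value and Choice classes below are hypothetical and assume only __or__ is defined. For classes used as operands, the same lookup happens on the metaclass.

```python
class Value:
    """Toy field that defines __or__ but no reflected __ror__."""
    def __init__(self, name):
        self.name = name

    def __or__(self, other):
        # Called for `Value(...) | other`.
        return Choice(self, other)

class Choice(Value):
    """Toy stand-in for Or."""
    def __init__(self, *options):
        self.options = list(options)

works = Value("Int") | 5       # handled by Value.__or__
try:
    5 | Value("Int")           # int cannot handle Value, and there is no __ror__
except TypeError as exc:
    print("TypeError:", exc)
```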
Odds¶
The gramfuzz.fields.Int, gramfuzz.fields.UInt, gramfuzz.fields.Float,
gramfuzz.fields.UFloat, and gramfuzz.fields.String classes each make use of
the gramfuzz.fields.Field.odds member when generating data, as well as (optionally)
min and max members.
An example of using the odds member can be seen in the default values
for the Int, Float, and String classes:
class Float(Int):
    # ...
    odds = [
        (0.75, [0.0, 100.0]), # 75% chance of being in the range [0.0, 100.0)
        (0.05, 0), # 5% chance of having the value 0
        (0.10, [100.0, 1000.0]), # 10% chance of being in the range [100.0, 1000.0)
        (0.10, [1000.0, 100000.0]), # 10% chance of being in the range [1000.0, 100000.0)
    ]
    # ...
Note that the probabilities of all entries in the odds list
should add up to 1.0.
See the documentation below for more details:
gramfuzz.fields.Int.__init__ (for min/max)
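One way to read an odds list is as a two-step draw: first pick an entry according to its probability, then either return its single value or pick a number from its range. The sketch below does this in plain Python. The function name sample_from_odds is invented for illustration, and this is not how gramfuzz is guaranteed to implement it.

```python
import random

# Same shape as the Float odds shown above.
FLOAT_ODDS = [
    (0.75, [0.0, 100.0]),
    (0.05, 0),
    (0.10, [100.0, 1000.0]),
    (0.10, [1000.0, 100000.0]),
]

def sample_from_odds(odds, rng=random):
    """Draw one value from a list of (probability, value-or-range) entries."""
    total = sum(prob for prob, _ in odds)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("odds must sum to 1.0, got %r" % total)
    roll = rng.random()
    cumulative = 0.0
    target = odds[-1][1]  # fallback guards against float rounding
    for prob, candidate in odds:
        cumulative += prob
        if roll < cumulative:
            target = candidate
            break
    if isinstance(target, (list, tuple)):
        low, high = target
        # uniform() may return the upper bound in rare rounding cases
        return rng.uniform(low, high)
    return target

print(sample_from_odds(FLOAT_ODDS))
```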
gramfuzz.fields Reference Documentation¶
This module defines all of the core gramfuzz fields:
Float
UFloat
Int
UInt
Join
Opt
Or
Q
Ref
String
Each field has a build() method, which accepts a pre argument
that can be used to assign prerequisites of the build result. As the signatures
below show, build() also accepts a shortest argument.
class gramfuzz.fields.And(*values, **kwargs)¶
A Field subclass that concatenates two values together. This class works nicely with Opt values.
class gramfuzz.fields.Def(name, *values, **options)¶
The Def class is used to define grammar rules. A defined rule has three parts:
Name - A rule name can be declared multiple times. When a rule name with multiple definitions is generated, one of the rule definitions will be chosen at random.
Values - The values of the rule. These will be concatenated (acts the same as an And).
Category - Which category to define the rule in. This is an important step and guides the fuzzer into choosing the correct rule definitions when randomly choosing rules to generate.
For example, suppose we defined a grammar for various types of postal addresses. We could have a grammar for US addresses, UK addresses, and Australian addresses. When we want the fuzzer to generate a random address, we would want it to choose one from our US, UK, or Australian address rules and not choose to generate only a zipcode rule.
I often have a main X category, as well as an X_def category. The X category is what I tell the fuzzer to choose from when randomly generating top-level rules. The X_def category is only used to help build the top-level rules.
__init__(name, *values, **options)¶
Create a new rule definition. Simply instantiating a new rule definition will add it to the current GramFuzzer instance.
Parameters
name (str) – The name of the rule being defined
values (list) – The list of values that define the value of the rule (will be concatenated when built)
cat (str) – The category to create the rule in (default="default").
no_prune (bool) – If this rule should not be pruned EVEN IF it is found to be unreachable (default=False)
build(pre=None, shortest=False)¶
Build this rule definition
Parameters
pre (list) – The prerequisites list
shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.
no_prune = False¶
Whether or not this rule should be pruned if the fuzzer cannot find a way to reach this rule. (default=False)
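The X / X_def category pattern can be illustrated with a toy rule registry in plain Python. Everything here is hypothetical (RuleRegistry, define, generate, ref) and only mimics the idea of categories. It is not the GramFuzzer API.

```python
import random
from collections import defaultdict

class RuleRegistry:
    """Toy registry: category -> rule name -> list of alternative definitions."""
    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.rules = defaultdict(lambda: defaultdict(list))

    def define(self, name, *values, cat="default"):
        # Declaring the same name again adds another alternative.
        self.rules[cat][name].append(values)

    def build(self, name, cat="default"):
        values = self.rng.choice(self.rules[cat][name])
        # Values are concatenated; callables are references to other rules.
        return "".join(v(self) if callable(v) else str(v) for v in values)

    def generate(self, cat):
        # Top-level generation only ever picks rules from the given category.
        name = self.rng.choice(sorted(self.rules[cat]))
        return self.build(name, cat)

def ref(name, cat="default"):
    """Reference a rule by name, resolved when the value is built."""
    return lambda registry: registry.build(name, cat)

reg = RuleRegistry(random.Random(0))
reg.define("zipcode", "90210", cat="address_def")
reg.define("us_address", "1 Main St, ", ref("zipcode", cat="address_def"), cat="address")
reg.define("uk_address", "2 High St, London", cat="address")
print(reg.generate("address"))
```

Because the zipcode rule lives only in the helper category, asking for a random address never yields a bare zipcode.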
class gramfuzz.fields.Field¶
The core class that all field classes are based on. Contains utility methods to determine probabilities/choices/min-max/etc.
odds = []¶
odds is a list of tuples that define probability values. Each item in the list must be a tuple of the form:
(X, Y)
Where X is the probability percent, and where Y is one of the following:
A single value
A list/tuple containing two values, the min and max of a range of numbers.
Note that the sum of the probability percents in the list must equal 1.0.
shortest_is_nothing = False¶
This is used during gramfuzz.GramFuzzer.find_shortest_paths. Sometimes the fuzzer cannot know based on the values in a field what that field’s minimal behavior will be. Setting this to True will explicitly let the GramFuzzer instance know what the minimal outcome will be.
NOTE when implementing a custom Field subclass and setting shortest_is_nothing to True, be sure to handle the case when build(shortest=True) is called so that a gramfuzz.errors.OptGram error is raised (which skips the current field from being generated).
class gramfuzz.fields.Float(value=None, **kwargs)¶
Defines a float Field with odds that define float values
class gramfuzz.fields.Int(value=None, **kwargs)¶
Represents all Integers, with predefined odds that target boundary conditions.
__init__(value=None, **kwargs)¶
Create a new Int object, optionally specifying a hard-coded value
Parameters
value (int) – The value of the new int object
min (int) – The minimum value (if value is not specified)
max (int) – The maximum value (if value is not specified)
odds (list) – The probability list. See Field.odds for more information.
build(pre=None, shortest=False)¶
Build the integer, optionally providing a pre list that may be used to define prerequisites for a Field being built.
Parameters
pre (list) – A list of prerequisites to be collected during the building of a Field.
shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.
class gramfuzz.fields.Join(*values, **kwargs)¶
A Field subclass that joins other values with a separator. This class works nicely with Opt values.
__init__(*values, **kwargs)¶
Create a new instance of the Join class.
Parameters
values (list) – The values to join
sep (str) – The string with which to separate each of the values (default=",")
max (int) – The maximum number of times (inclusive) to build the first item in values. This can be useful when a variable number of items in a list is needed. E.g.:
Join(Int, max=5, sep=",")
class gramfuzz.fields.MetaField¶
Used as the metaclass of the core gramfuzz.fields.Field class. MetaField defines __and__, __or__, and __repr__ methods. The overridden & and | operators allow classes themselves to be wrapped in a gramfuzz.fields.And or gramfuzz.fields.Or without having to instantiate them first.
E.g. the two lines in each pair below are equivalent:
And(Int(), Float())
(Int & Float)
Or(Int(), Float())
(Int | Float)
Do note however that this can only be done if the first (farthest to the left) operand is a Field class or instance.
E.g. the first line below will work, but the second line will not:
Or(5, Int)
5 | Int
It is also recommended that the overloaded & and | operators be used only in very simple cases, since it is impossible for the code to know the difference between the two statements below:
(Int | Float) | UInt
Int | Float | UInt
class gramfuzz.fields.Opt(*values, **kwargs)¶
A Field subclass that randomly chooses to either build the provided values (acts as an And in that case), or raise an errors.OptGram exception.
When an errors.OptGram exception is raised, the current value being built is then skipped.
__init__(*values, **kwargs)¶
Create a new Opt instance
Parameters
values (list) – The list of values to build (or not)
prob (float) – A float value between 0 and 1 that defines the probability of cancelling the current build.
build(pre=None, shortest=False)¶
Build the current Opt instance
Parameters
pre (list) – The prerequisites list
shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.
prob = 0.5¶
The probability of an Opt instance raising an errors.OptGram exception
class gramfuzz.fields.Or(*values, **kwargs)¶
A Field subclass that chooses one of the provided values at random as the result of a call to the build() method.
class gramfuzz.fields.PLUS(*values, **kwargs)¶
Acts like the + in a regex - one or more of the values. The values are Anded together one or more times, up to max times.
__init__(*values, **kwargs)¶
Create a new instance of the Join class.
Parameters
values (list) – The values to join
sep (str) – The string with which to separate each of the values (default=",")
max (int) – The maximum number of times (inclusive) to build the first item in values. This can be useful when a variable number of items in a list is needed. E.g.:
Join(Int, max=5, sep=",")
class gramfuzz.fields.Q(*values, **kwargs)¶
A Field subclass that quotes whatever value is provided.
__init__(*values, **kwargs)¶
Create the new Quote instance
Parameters
escape (bool) – Whether or not quoted data should be escaped (default=False)
html_js_escape (bool) – Whether or not quoted data should be html-javascript escaped (default=False)
quote (str) – The quote character to be used if escape and html_js_escape are False
build(pre=None, shortest=False)¶
Build the Quote instance
Parameters
pre (list) – The prerequisites list
shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.
escape = False¶
Whether or not the quoted data should be escaped (default=False). Uses repr(X)
html_js_escape = False¶
Whether or not the quoted data should be html-javascript escaped (default=False)
class gramfuzz.fields.Ref(refname, **kwargs)¶
The Ref class is used to reference defined rules by their name. If a rule name is defined multiple times, one will be chosen at random.
For example, suppose we have a rule that returns an integer:
Def("integer", UInt)
We could define another rule that creates a Float by referencing the integer rule twice, and placing a period between them:
Def("float", Ref("integer"), ".", Ref("integer"))
__init__(refname, **kwargs)¶
Create a new Ref instance
Parameters
refname (str) – The name of the rule to reference
cat (str) – The name of the category the rule is defined in
build(pre=None, shortest=False)¶
Build the Ref instance by fetching the rule from the GramFuzzer instance and building it
Parameters
pre (list) – The prerequisites list
shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.
cat = 'default'¶
The default category where the referenced rule definition will be looked for
class gramfuzz.fields.STAR(*values, **kwargs)¶
Acts like the * in a regex - zero or more of the values. The values are Anded together zero or more times, up to max times.
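The difference between the regex-style + and * repetitions comes down to the minimum count. The sketch below shows that in plain Python. The names repeat, plus, and star are hypothetical helpers rather than the gramfuzz classes, and no separator is used.

```python
import random

def repeat(build_item, min_count, max_count, rng=random):
    """Concatenate between min_count and max_count (inclusive) built items."""
    count = rng.randint(min_count, max_count)
    return "".join(build_item() for _ in range(count))

def plus(build_item, max_count=5, rng=random):
    """One or more repetitions, like + in a regex."""
    return repeat(build_item, 1, max_count, rng)

def star(build_item, max_count=5, rng=random):
    """Zero or more repetitions, like * in a regex."""
    return repeat(build_item, 0, max_count, rng)

print(repr(plus(lambda: "ab")))
print(repr(star(lambda: "ab")))
```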
class gramfuzz.fields.String(value=None, **kwargs)¶
Defines a string field
__init__(value=None, **kwargs)¶
Create a new instance of the String field.
Parameters
value – The hard-coded value of the String field
min (int) – The minimum size of the String when built
max (int) – The maximum size of the String when built
charset (str) – The character-set to be used when building the string
build(pre=None, shortest=False)¶
Build the String instance
Parameters
pre (list) – The prerequisites list (optional, default=None)
shortest (bool) – Whether or not the shortest reference-chain (most minimal) version of the field should be generated.
charset = b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'¶
The default character set of the String field class (default=charset_alpha)
charset_all = b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\xc2\x80\xc2\x81\xc2\x82\xc2\x83\xc2\x84\xc2\x85\xc2\x86\xc2\x87\xc2\x88\xc2\x89\xc2\x8a\xc2\x8b\xc2\x8c\xc2\x8d\xc2\x8e\xc2\x8f\xc2\x90\xc2\x91\xc2\x92\xc2\x93\xc2\x94\xc2\x95\xc2\x96\xc2\x97\xc2\x98\xc2\x99\xc2\x9a\xc2\x9b\xc2\x9c\xc2\x9d\xc2\x9e\xc2\x9f\xc2\xa0\xc2\xa1\xc2\xa2\xc2\xa3\xc2\xa4\xc2\xa5\xc2\xa6\xc2\xa7\xc2\xa8\xc2\xa9\xc2\xaa\xc2\xab\xc2\xac\xc2\xad\xc2\xae\xc2\xaf\xc2\xb0\xc2\xb1\xc2\xb2\xc2\xb3\xc2\xb4\xc2\xb5\xc2\xb6\xc2\xb7\xc2\xb8\xc2\xb9\xc2\xba\xc2\xbb\xc2\xbc\xc2\xbd\xc2\xbe\xc2\xbf\xc3\x80\xc3\x81\xc3\x82\xc3\x83\xc3\x84\xc3\x85\xc3\x86\xc3\x87\xc3\x88\xc3\x89\xc3\x8a\xc3\x8b\xc3\x8c\xc3\x8d\xc3\x8e\xc3\x8f\xc3\x90\xc3\x91\xc3\x92\xc3\x93\xc3\x94\xc3\x95\xc3\x96\xc3\x97\xc3\x98\xc3\x99\xc3\x9a\xc3\x9b\xc3\x9c\xc3\x9d\xc3\x9e\xc3\x9f\xc3\xa0\xc3\xa1\xc3\xa2\xc3\xa3\xc3\xa4\xc3\xa5\xc3\xa6\xc3\xa7\xc3\xa8\xc3\xa9\xc3\xaa\xc3\xab\xc3\xac\xc3\xad\xc3\xae\xc3\xaf\xc3\xb0\xc3\xb1\xc3\xb2\xc3\xb3\xc3\xb4\xc3\xb5\xc3\xb6\xc3\xb7\xc3\xb8\xc3\xb9\xc3\xba\xc3\xbb\xc3\xbc\xc3\xbd\xc3\xbe\xc3\xbf'¶
All possible binary characters (0x0-0xff)
charset_alpha = b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'¶
Upper- and lower-case alphabet
charset_alpha_lower = b'abcdefghijklmnopqrstuvwxyz'¶
A lower-case alphabet character set
charset_alpha_upper = b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'¶
An upper-case alphabet character set
charset_alphanum = b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890'¶
Alpha-numeric character set (upper- and lower-case alphabet + numbers)
charset_num = b'1234567890'¶
Numeric character set
charset_spaces = b'\n\r\t '¶
Whitespace character set
odds = [(0.85, [0, 20]), (0.1, 1), (0.025, 0), (0.025, [20, 100])]¶
Unlike numeric Field types, the odds value for the String field defines the length of the field, not characters used in the string.
See the gramfuzz.fields.Field.odds member for details on the format of the odds probability list.
gramfuzz.fields.WOr¶
alias of gramfuzz.fields.WeightedOr
class gramfuzz.fields.WeightedOr(*values, **kwargs)¶
A Field subclass that chooses one of the provided values at random as the result of a call to the build() method. Takes an odds array rather than just direct values. Also aliased to WOr. E.g.
WeightedOr(
    ("hello", 0.1), # 10% chance
    (UInt, 0.7), # 70% chance
    (3.14, 0.2), # 20% chance
)
# or
WOr(
    ("hello", 0.1), # 10% chance
    (UInt, 0.7), # 70% chance
    (3.14, 0.2), # 20% chance
)