kodiak package
Submodules
kodiak.args_dict_builder module
kodiak.args_parser module
class kodiak.args_parser.Match(original, label=None, value=None, payload=None)
Bases: object
An object generated after the process of matching and passed to the colbuilder
original
the unmodified matched string
Type: str
value
a possible derived string from original
Type: str
label
used as the name or title of the Match
Type: str
payload
a dict with extra information, such as default_colbuilder, that can be used by the colbuilder in kodiak_dataframe.gencol
Type: dict
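To make the shape of this object concrete, here is a minimal stand-in, not kodiak's actual class. The name `MatchSketch` is hypothetical; it only mirrors the four documented attributes and the defaults shown in the constructor signature.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MatchSketch:
    """Hypothetical stand-in that mirrors the documented Match attributes."""
    original: str                   # the unmodified matched string
    label: Optional[str] = None     # name or title of the match
    value: Optional[str] = None     # string possibly derived from original
    payload: Optional[dict] = None  # extra information, e.g. default_colbuilder


# Illustrative values only; what kodiak really stores is not specified here.
m = MatchSketch("month", label="month", value="month", payload={})
print(m.original, m.payload)
```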
kodiak.colbuilders module
Helper functions to use as colbuilders with gencol and mutcol
kodiak.colbuilders.splitter(pattern=None)
A builder function that returns a colbuilder
Parameters: pattern – a string pattern used to split a string
Example
>>> from kodiak.kodiak_dataframe import KodiakDataFrame
>>> from kodiak.colbuilders import splitter
>>> df = KodiakDataFrame({'name': ['Groucho Marx', 'Harpo Marx']})
>>> df.gencol('{first,last}_name', 'name', splitter(" "))
Will return the following data frame:
>>> #            name first_name last_name
>>> # 0  Groucho Marx    Groucho      Marx
>>> # 1    Harpo Marx      Harpo      Marx
Returns: A function used as a colbuilder
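The "builder function that returns a colbuilder" pattern is an ordinary closure. The sketch below uses only plain Python and is not kodiak's implementation: `splitter_sketch` is a hypothetical name, and the assumption that the colbuilder receives a position telling it which piece to return is ours, since this page does not say how splitter maps template arguments to pieces.

```python
def splitter_sketch(pattern=None):
    """Hypothetical builder: returns a function that splits on `pattern`."""
    def colbuilder(position, x, y):
        # position: which piece to take (an assumption of this sketch)
        # x: the cell value; y: the template argument (unused here)
        return x.split(pattern)[position]
    return colbuilder


build = splitter_sketch(" ")
names = ["Groucho Marx", "Harpo Marx"]

# Apply the closure by hand, once per template argument and per value.
rows = [
    {y + "_name": build(i, x, y) for i, y in enumerate(["first", "last"])}
    for x in names
]
print(rows)
```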
kodiak.config module
kodiak.config.base_config(parser=None, match_transform=None, new_col_combiner=None, unpack=None, drop=None, col_pair_combiner=None)
Default config used by gencol and mutcol
Parameters: - parser – By default, Kodiak uses ArgsParser to parse newcols.
- match_transform – Data passed to the colbuilder can be transformed first. By default the default_transform pipeline is used; you can replace it with an array of transform objects.
- new_col_combiner – Parameters in the newcols template provide arguments to the colbuilder. Arguments from different groups can be combined in different ways, e.g. "foo_{a,b}_{c,d}" has two groups: ['a', 'b'] and ['c', 'd']. By default zip is used, but you can replace it with a function with the same signature.
- unpack (bool) – True by default. The arguments passed to the colbuilder are of type Match; on certain occasions you can pass strings instead.
- drop (bool) – False by default. Set to True if you want to drop the column col in gencol after the new columns are created.
- col_pair_combiner – Once the arguments have been extracted from the newcols template string, they are combined with the data extracted from col. This option controls how these two elements are combined. Currently product from itertools is used; any replacement must have the same signature.
Returns: dict with base config options
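The two default combiners named above, zip and itertools.product, can be compared on the template groups from the example. This is a sketch of how the standard-library functions behave, not of kodiak's internals: the exact way kodiak calls them (for instance `zip(*groups)`) and the `values` list are assumptions made for illustration.

```python
from itertools import product

# The two groups from the template "foo_{a,b}_{c,d}".
groups = [["a", "b"], ["c", "d"]]

# zip pairs arguments position by position.
zipped = list(zip(*groups))
print(zipped)    # [('a', 'c'), ('b', 'd')]

# product would instead yield every combination across groups.
crossed = list(product(*groups))
print(crossed)   # [('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd')]

# Pairing column data with the combined arguments, as col_pair_combiner
# is described to do. `values` is hypothetical column data.
values = ["x1", "x2"]
pairs = list(product(values, zipped))
print(pairs)
```

A replacement for either combiner would need to accept the same inputs and yield tuples in the same shape.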
kodiak.config.cfg(parser=None, match_transform=None, new_col_combiner=None, unpack=None, drop=None, col_pair_combiner=None)
Default config used by gencol and mutcol
Parameters: - parser – By default, Kodiak uses ArgsParser to parse newcols.
- match_transform – Data passed to the colbuilder can be transformed first. By default the default_transform pipeline is used; you can replace it with an array of transform objects.
- new_col_combiner – Parameters in the newcols template provide arguments to the colbuilder. Arguments from different groups can be combined in different ways, e.g. "foo_{a,b}_{c,d}" has two groups: ['a', 'b'] and ['c', 'd']. By default zip is used, but you can replace it with a function with the same signature.
- unpack (bool) – True by default. The arguments passed to the colbuilder are of type Match; on certain occasions you can pass strings instead.
- drop (bool) – False by default. Set to True if you want to drop the column col in gencol after the new columns are created.
- col_pair_combiner – Once the arguments have been extracted from the newcols template string, they are combined with the data extracted from col. This option controls how these two elements are combined. Currently product from itertools is used; any replacement must have the same signature.
Returns: dict with base config options
kodiak.config.restore_default_config(*keys)
Restore the original configuration for all or specific properties
If no keys are given, the whole configuration is restored; if keys are given, only those options are restored.
Parameters: keys – a list of strings that correspond to options
Returns: Nothing
Raises: KeyError if key is not a valid option
kodiak.kodiak_dataframe module
class kodiak.kodiak_dataframe.KodiakDataFrame(*args, **kwargs)
Bases: pandas.core.frame.DataFrame
A KodiakDataFrame is a pandas.DataFrame that has new capabilities to ease your workflow: gencol and mutcol
Example
>>> from kodiak import KodiakDataFrame
>>> kdf = KodiakDataFrame({'country': ['ar','br','cl','co']})
gencol(newcols, col, colbuilder=None, drop=None, enum=False, config=None)
Generate new columns following the newcols pattern based on col
Parameters: - newcols (str) – new column/s template string
- col (str) – column name from where data is taken
- colbuilder – a function to build the new columns; it can be omitted if it can be deduced from newcols. Usually the function takes two arguments, x and y: x is the data extracted from col and y is the argument extracted from the newcols template.
Example
If newcols is "born_{month,day,year}", col is born and one value of born is the date 1980-12-24, then on successive calls (x, y) would be ('1980-12-24', 'month'), ('1980-12-24', 'day') and ('1980-12-24', 'year').
- drop (bool) – True if you want to drop the column col
- enum (bool) – False by default. If True, the colbuilder is expected to take three arguments: index, x and y
- config – custom configuration built with base_config
Raises: ValueError
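The colbuilder contract described in the parameters above can be tried in plain Python without a data frame. `date_part` and `date_part_enum` are hypothetical colbuilders written for this sketch, not functions shipped with kodiak; they assume the cell values are ISO-formatted date strings. What index refers to in the three-argument form is not specified on this page, so the sketch only passes it through.

```python
from datetime import date


def date_part(x, y):
    """Hypothetical colbuilder: x is a cell value, y the template argument."""
    # y is one of "month", "day" or "year", which are attributes of date.
    return getattr(date.fromisoformat(x), y)


def date_part_enum(index, x, y):
    """Three-argument form, as expected when enum is True."""
    return (index, date_part(x, y))


# The (x, y) pairs from the example above, applied by hand.
for y in ("month", "day", "year"):
    print("born_" + y, date_part("1980-12-24", y))
```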
kodiak.transforms module
class kodiak.transforms.MethodTransform
Bases: object
transform(match)
Adds to the Match object payload the default_colbuilder: colbuilders.as_method
Parameters: match (Match) – The Match object that is going to be enriched.
Returns: The enriched Match object with a default_colbuilder key in the payload
Return type: Match
Raises: ValueError – in case the Match value attribute is ambiguous.
class kodiak.transforms.PropertyTransform
Bases: object
transform(match)
Adds to the Match object payload the default_colbuilder: colbuilders.as_attribute
Parameters: match (Match) – The Match object that is going to be enriched.
Returns: The enriched Match object with a default_colbuilder key in the payload
Return type: Match
Raises: ValueError – in case the Match value attribute is ambiguous.
Module contents
Top-level package for kodiak.