row_filter
knime2py.nodes.row_filter
Row Filter module for KNIME to Python conversion.
Overview
This module generates Python code that filters rows of an input DataFrame based on predicates defined in a KNIME settings.xml file. The generated code constructs a boolean mask from the predicates, applies it to the DataFrame, and outputs the filtered result.
Runtime Behavior
Inputs: Reads a DataFrame from the context using the key format 'src_id:in_port'.
Outputs: Writes the filtered DataFrame to the context using the key format 'node_id:out_port'.
Key algorithms or mappings: The module supports various comparison operators (e.g., equality, greater than) and handles column normalization and missing values.
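As a rough illustration of how comparison operators can be mapped onto a pandas boolean mask, here is a minimal sketch. The operator names in `OPS`, the `(column, operator, value)` tuple shape, and the `build_mask` helper are assumptions made for this example and are not taken from the module; the code knime2py actually emits may look different.

```python
import operator

import pandas as pd

# Hypothetical operator names; the identifiers used by KNIME may differ.
OPS = {
    "EQ": operator.eq,
    "NEQ": operator.ne,
    "GT": operator.gt,
    "GTE": operator.ge,
    "LT": operator.lt,
    "LTE": operator.le,
}


def build_mask(df, predicates, match_and=True):
    """Combine one boolean Series per predicate using AND or OR."""
    # True is the neutral start value for AND, False for OR.
    mask = pd.Series(match_and, index=df.index)
    for column, op_name, value in predicates:
        part = OPS[op_name](df[column], value)
        mask = (mask & part) if match_and else (mask | part)
    return mask


df = pd.DataFrame({"age": [25, 40, 31], "city": ["Oslo", "Rome", "Oslo"]})
predicates = [("age", "GT", 30), ("city", "EQ", "Oslo")]
filtered = df[build_mask(df, predicates)]
```

With AND semantics only the last row satisfies both predicates; passing `match_and=False` keeps every row that satisfies at least one of them.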
Edge Cases
The code implements safeguards for missing columns, empty predicates, and NaN values, ensuring that the filtering logic remains robust under various input conditions.
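The safeguards described above could look roughly like the following sketch. The policies shown here (pass all rows through when there are no usable predicates, skip predicates on missing columns, treat NaN comparisons as non-matching) and the "NON_MATCHING" mode name are assumptions for illustration; the module may resolve these cases differently.

```python
import pandas as pd


def filter_rows(df, predicates, output_mode="MATCHING"):
    """Keep rows where every (column, threshold) predicate holds (column > threshold)."""
    masks = []
    for column, threshold in predicates:
        if column not in df.columns:
            continue  # assumed policy: ignore predicates on missing columns
        # Comparisons against NaN/NA count as "no match".
        masks.append((df[column] > threshold).fillna(False).astype(bool))
    if not masks:
        return df.copy()  # assumed policy: nothing to filter on, pass through
    mask = masks[0]
    for part in masks[1:]:
        mask = mask & part
    # "NON_MATCHING" is a hypothetical name for the inverted output mode.
    return df[mask] if output_mode == "MATCHING" else df[~mask]


df = pd.DataFrame({"x": [1.0, None, 5.0]})
matching = filter_rows(df, [("x", 2)])
passthrough = filter_rows(df, [])
missing_col = filter_rows(df, [("not_there", 0)])
```

Note that under these assumptions a row with a missing value lands in the non-matching output, since its comparison evaluates to False.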
Generated Code Dependencies
The generated code requires the following external libraries:
- pandas
These dependencies are required by the generated code, not by this module itself.
Usage
This module is typically invoked by the knime2py emitter as part of the conversion process from KNIME workflows to Python code. An example of expected context access:
df = context['src_id:in_port'] # input table
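To show the read-filter-write round trip in one place, the sketch below uses a plain dict as the context. The node IDs ("1" for the upstream node, "2" for this node), the port number, and the `score` column are made up for the example; only the 'src_id:in_port' and 'node_id:out_port' key formats come from the description above.

```python
import pandas as pd

# A plain dict stands in for the shared context (assumption for this sketch).
context = {}

# Pretend upstream node "1" published a table on its port 1.
context["1:1"] = pd.DataFrame({"score": [0.2, 0.9, 0.5]})

df = context["1:1"]              # input table, key format 'src_id:in_port'
out_df = df[df["score"] > 0.4]   # illustrative predicate
context["2:1"] = out_df          # output table, key format 'node_id:out_port'
```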
Node Identity
KNIME factory ID: FACTORY = "org.knime.base.node.preproc.filter.row3.RowFilterNodeFactory"
Configuration
The settings are defined in the RowFilterSettings dataclass, which includes:
- match_and: bool (default=True) - Determines if predicates are combined with AND or OR.
- output_mode: str (default="MATCHING") - Specifies whether to output matching or non-matching rows.
- predicates: List[Predicate] - Contains the filtering criteria.
The parse_row_filter_settings function extracts these values from the settings.xml file using XPath queries.
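The following sketch shows one way a settings dataclass can be filled from XML with XPath-style lookups. The `RowFilterSettings` fields and defaults follow the list above; everything else is an assumption: the `Predicate` fields, the XML keys (`matchCriteria`, `outputMode`, `predicates`), the simplified sample document, and the `parse_settings_sketch` helper, which takes XML text rather than a node directory. A real settings.xml is more verbose and the actual function may query different paths.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Predicate:
    # Hypothetical fields; the real Predicate class may differ.
    column: str
    operator: str
    value: str


@dataclass
class RowFilterSettings:
    match_and: bool = True
    output_mode: str = "MATCHING"
    predicates: List[Predicate] = field(default_factory=list)


# Simplified, made-up stand-in for a settings.xml file.
SAMPLE_XML = """
<config key="settings">
  <config key="model">
    <entry key="matchCriteria" value="OR"/>
    <entry key="outputMode" value="MATCHING"/>
    <config key="predicates">
      <config key="0">
        <entry key="column" value="age"/>
        <entry key="operator" value="GT"/>
        <entry key="value" value="30"/>
      </config>
    </config>
  </config>
</config>
"""


def _entry(node, key, default=""):
    """Return the value attribute of the child <entry> with the given key."""
    found = node.find(f"entry[@key='{key}']")
    return default if found is None else found.get("value", default)


def parse_settings_sketch(xml_text):
    root = ET.fromstring(xml_text.strip())
    model = root.find("./config[@key='model']")
    if model is None:
        return RowFilterSettings()  # fall back to the documented defaults
    predicates = [
        Predicate(_entry(p, "column"), _entry(p, "operator"), _entry(p, "value"))
        for p in model.findall("./config[@key='predicates']/config")
    ]
    return RowFilterSettings(
        match_and=_entry(model, "matchCriteria", "AND") == "AND",
        output_mode=_entry(model, "outputMode", "MATCHING"),
        predicates=predicates,
    )


settings = parse_settings_sketch(SAMPLE_XML)
```

Falling back to the dataclass defaults when a key is absent keeps the parser tolerant of partially filled settings, which matches the defaults listed above.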
Limitations
This module does not support all KNIME filtering options and may approximate behavior in certain cases.
References
For more information, refer to the KNIME documentation and the following hub URL: https://hub.knime.com/knime/extensions/org.knime.features.base/latest/org.knime.base.node.preproc.filter.row3.RowFilterNodeFactory
parse_row_filter_settings(node_dir)
Parse the row filter settings from the settings.xml file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| node_dir | Optional[Path] | The directory containing the settings.xml file. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| RowFilterSettings | RowFilterSettings | The parsed row filter settings. |
Source code in src/knime2py/nodes/row_filter.py
generate_imports()
Generate the necessary import statements for the row filter code.
Returns:
| Type | Description |
|---|---|
| List[str] | A list of import statements. |
Source code in src/knime2py/nodes/row_filter.py
generate_py_body(node_id, node_dir, in_ports, out_ports=None)
Generate the body of the Python code for the row filter node.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| node_id | str | The ID of the node. | required |
| node_dir | Optional[str] | The directory of the node. | required |
| in_ports | List[object] | The list of incoming ports. | required |
| out_ports | Optional[List[str]] | The list of outgoing ports. | None |
Returns:
| Type | Description |
|---|---|
| List[str] | A list of lines of code that make up the body of the node. |
Source code in src/knime2py/nodes/row_filter.py
get_name()
Return the name of the node in the KNIME workflow.
Source code in src/knime2py/nodes/row_filter.py
handle(ntype, nid, npath, incoming, outgoing)
Handle the processing of the row filter node.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ntype | | The type of the node. | required |
| nid | | The ID of the node. | required |
| npath | | The path to the node. | required |
| incoming | | The incoming connections. | required |
| outgoing | | The outgoing connections. | required |
Returns:
| Type | Description |
|---|---|
| Tuple[List[str], List[str]] | A tuple containing the imports and body lines. |
Source code in src/knime2py/nodes/row_filter.py