ClickHouse Adapter - Nullable(Bool) seems to break column parsing #123

Open
babaMar opened this issue Jan 2, 2025 · 4 comments

babaMar commented Jan 2, 2025

While testing the ClickHouse adapter, I came across a crash that seems to originate from a column of type Nullable(Bool).

Here's the stack trace:

odd-collector-1          | 2025-01-02 13:10:15.898 | DEBUG    | odd_collector_sdk.job:log_execution:24 - Traceback (most recent call last):
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/odd_collector_sdk/job.py", line 22, in log_execution
odd-collector-1          |     yield
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/odd_collector_sdk/job.py", line 107, in start
odd-collector-1          |     for del_ in self._get_data_entity_list():
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/odd_collector_sdk/job.py", line 111, in _get_data_entity_list
odd-collector-1          |     data_entity_lists = self._adapter.get_data_entity_list()
odd-collector-1          |   File "/app/odd_collector/adapters/clickhouse/adapter.py", line 39, in get_data_entity_list
odd-collector-1          |     items=self.get_data_entities(),
odd-collector-1          |   File "/app/odd_collector/adapters/clickhouse/adapter.py", line 28, in get_data_entities
odd-collector-1          |     return map_table(
odd-collector-1          |   File "/app/odd_collector/adapters/clickhouse/mappers/tables.py", line 73, in map_table
odd-collector-1          |     column_data_fields = build_dataset_fields(
odd-collector-1          |   File "/app/odd_collector/adapters/clickhouse/mappers/columns.py", line 142, in build_dataset_fields
odd-collector-1          |     type_tree = parser.parse(column.type)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/lark.py", line 581, in parse
odd-collector-1          |     return self.parser.parse(text, start=start, on_error=on_error)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/parser_frontends.py", line 106, in parse
odd-collector-1          |     return self.parser.parse(stream, chosen_start, **kw)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/parsers/earley.py", line 297, in parse
odd-collector-1          |     to_scan = self._parse(lexer, columns, to_scan, start_symbol)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/parsers/xearley.py", line 144, in _parse
odd-collector-1          |     to_scan = scan(i, to_scan)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/parsers/xearley.py", line 118, in scan
odd-collector-1          |     raise UnexpectedCharacters(stream, i, text_line, text_column, {item.expect.name for item in to_scan},
odd-collector-1          | lark.exceptions.UnexpectedCharacters: No terminal matches '(' in the current parser context, at line 1 col 9
odd-collector-1          |
odd-collector-1          | Nullable(Bool)
odd-collector-1          |         ^
odd-collector-1          |
odd-collector-1          |
odd-collector-1          | 2025-01-02 13:10:15.898 | ERROR    | odd_collector_sdk.job:log_execution:25 - [clickhouse_adapter] failed.
odd-collector-1          |  No terminal matches '(' in the current parser context, at line 1 col 9
odd-collector-1          |
odd-collector-1          | Nullable(Bool)
odd-collector-1          |         ^

babaMar commented Jan 22, 2025

Encountered another similar crash:

odd-collector-1          | 2025-01-22 15:29:00.454 | DEBUG    | odd_collector_sdk.job:log_execution:24 - Traceback (most recent call last):
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/odd_collector_sdk/job.py", line 22, in log_execution
odd-collector-1          |     yield
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/odd_collector_sdk/job.py", line 107, in start
odd-collector-1          |     for del_ in self._get_data_entity_list():
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/odd_collector_sdk/job.py", line 111, in _get_data_entity_list
odd-collector-1          |     data_entity_lists = self._adapter.get_data_entity_list()
odd-collector-1          |   File "/app/odd_collector/adapters/clickhouse/adapter.py", line 39, in get_data_entity_list
odd-collector-1          |     items=self.get_data_entities(),
odd-collector-1          |   File "/app/odd_collector/adapters/clickhouse/adapter.py", line 28, in get_data_entities
odd-collector-1          |     return map_table(
odd-collector-1          |   File "/app/odd_collector/adapters/clickhouse/mappers/tables.py", line 73, in map_table
odd-collector-1          |     column_data_fields = build_dataset_fields(
odd-collector-1          |   File "/app/odd_collector/adapters/clickhouse/mappers/columns.py", line 142, in build_dataset_fields
odd-collector-1          |     type_tree = parser.parse(column.type)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/lark.py", line 581, in parse
odd-collector-1          |     return self.parser.parse(text, start=start, on_error=on_error)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/parser_frontends.py", line 106, in parse
odd-collector-1          |     return self.parser.parse(stream, chosen_start, **kw)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/parsers/earley.py", line 297, in parse
odd-collector-1          |     to_scan = self._parse(lexer, columns, to_scan, start_symbol)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/parsers/xearley.py", line 144, in _parse
odd-collector-1          |     to_scan = scan(i, to_scan)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/parsers/xearley.py", line 118, in scan
odd-collector-1          |     raise UnexpectedCharacters(stream, i, text_line, text_column, {item.expect.name for item in to_scan},
odd-collector-1          | lark.exceptions.UnexpectedCharacters: No terminal matches ')' in the current parser context, at line 1 col 13
odd-collector-1          |
odd-collector-1          | DateTime64(6)
odd-collector-1          |             ^
odd-collector-1          | Expected one of:
odd-collector-1          |      * COMMA
odd-collector-1          |

@ValeriyWorld

I did some research and found the root cause. Our current implementation uses lark to parse ClickHouse column types, and it looks like the .lark grammar file has some bugs and inconsistencies. It is a tricky problem, though: some of my attempts to add new parsing rules ran into errors of their own. Thank you for reporting the various cases where it crashes.
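
To make concrete what a column-type parser has to accept, here is a minimal sketch covering the two shapes reported so far: a type that wraps another type (Nullable(Bool)) and a type with a numeric parameter (DateTime64(6)). It is a hand-written recursive-descent parser that uses only the Python standard library. It is not lark and not the adapter's actual .lark grammar, and the names TypeNode, tokenize, and parse_type are hypothetical, chosen only for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Tuple, Union

# One token: a type name, an unsigned integer, or a punctuation character.
TOKEN_RE = re.compile(r"(?P<name>[A-Za-z_][A-Za-z0-9_]*)|(?P<int>\d+)|(?P<punct>[(),])")


@dataclass
class TypeNode:
    """Hypothetical tree node: a type name plus its (possibly nested) arguments."""
    name: str
    args: List[Union["TypeNode", int]] = field(default_factory=list)


def tokenize(text: str) -> List[Tuple[str, str]]:
    """Split a type string into (kind, value) pairs, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at col {pos + 1}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def _parse_arg(tokens, pos):
    # An argument is either an integer literal or a nested type.
    if pos < len(tokens) and tokens[pos][0] == "int":
        return int(tokens[pos][1]), pos + 1
    return _parse_node(tokens, pos)


def _parse_node(tokens, pos):
    if pos >= len(tokens) or tokens[pos][0] != "name":
        raise ValueError("expected a type name")
    node = TypeNode(tokens[pos][1])
    pos += 1
    # Optional parenthesized, comma-separated argument list.
    if pos < len(tokens) and tokens[pos] == ("punct", "("):
        pos += 1
        while True:
            arg, pos = _parse_arg(tokens, pos)
            node.args.append(arg)
            if pos < len(tokens) and tokens[pos] == ("punct", ","):
                pos += 1
                continue
            break
        if pos >= len(tokens) or tokens[pos] != ("punct", ")"):
            raise ValueError("expected ')'")
        pos += 1
    return node, pos


def parse_type(text: str) -> TypeNode:
    """Parse a whole type string and reject trailing tokens."""
    tokens = tokenize(text)
    node, pos = _parse_node(tokens, 0)
    if pos != len(tokens):
        raise ValueError(f"unexpected trailing token {tokens[pos][1]!r}")
    return node


if __name__ == "__main__":
    for sample in ("Nullable(Bool)", "DateTime64(6)"):
        print(sample, "->", parse_type(sample))
```

The point of the sketch is that both reported inputs reduce to the same rule, a name followed by an optional argument list whose items are either nested types or literals, so a grammar that treats wrapper types and parameterized types as separate special cases is more likely to miss combinations.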


ValeriyWorld commented Jan 29, 2025

@babaMar I made a new release with fixes for these 2 cases. You can try it out. Thank you for reporting these bugs.


babaMar commented Jan 30, 2025

@ValeriyWorld I just tried it and those two cases are fixed, thanks!

Unfortunately it looks like there are other data types causing a similar crash:

odd-collector-1          | 2025-01-30 09:27:19.077 | DEBUG    | odd_collector.adapters.clickhouse.mappers.columns:_build_dataset_fields:125 - Column created_timestamp has ODD type Type.TYPE_DATETIME and logical type DateTime64(6)
odd-collector-1          | 2025-01-30 09:27:19.082 | DEBUG    | odd_collector_sdk.job:log_execution:24 - Traceback (most recent call last):
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/odd_collector_sdk/job.py", line 22, in log_execution
odd-collector-1          |     yield
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/odd_collector_sdk/job.py", line 107, in start
odd-collector-1          |     for del_ in self._get_data_entity_list():
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/odd_collector_sdk/job.py", line 111, in _get_data_entity_list
odd-collector-1          |     data_entity_lists = self._adapter.get_data_entity_list()
odd-collector-1          |   File "/app/odd_collector/adapters/clickhouse/adapter.py", line 39, in get_data_entity_list
odd-collector-1          |     items=self.get_data_entities(),
odd-collector-1          |   File "/app/odd_collector/adapters/clickhouse/adapter.py", line 28, in get_data_entities
odd-collector-1          |     return map_table(
odd-collector-1          |   File "/app/odd_collector/adapters/clickhouse/mappers/tables.py", line 73, in map_table
odd-collector-1          |     column_data_fields = build_dataset_fields(
odd-collector-1          |   File "/app/odd_collector/adapters/clickhouse/mappers/columns.py", line 144, in build_dataset_fields
odd-collector-1          |     type_tree = parser.parse(column.type)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/lark.py", line 581, in parse
odd-collector-1          |     return self.parser.parse(text, start=start, on_error=on_error)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/parser_frontends.py", line 106, in parse
odd-collector-1          |     return self.parser.parse(stream, chosen_start, **kw)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/parsers/earley.py", line 297, in parse
odd-collector-1          |     to_scan = self._parse(lexer, columns, to_scan, start_symbol)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/parsers/xearley.py", line 144, in _parse
odd-collector-1          |     to_scan = scan(i, to_scan)
odd-collector-1          |   File "/usr/local/lib/python3.9/site-packages/lark/parsers/xearley.py", line 118, in scan
odd-collector-1          |     raise UnexpectedCharacters(stream, i, text_line, text_column, {item.expect.name for item in to_scan},
odd-collector-1          | lark.exceptions.UnexpectedCharacters: No terminal matches ''' in the current parser context, at line 1 col 15
odd-collector-1          | 
odd-collector-1          | DateTime64(3, 'GMT')
odd-collector-1          |               ^
odd-collector-1          | Expected one of: 
odd-collector-1          |      * TIMEZONE
odd-collector-1          | 
odd-collector-1          | 
odd-collector-1          | 2025-01-30 09:27:19.082 | ERROR    | odd_collector_sdk.job:log_execution:25 - [clickhouse_adapter] failed.
odd-collector-1          |  No terminal matches ''' in the current parser context, at line 1 col 15
odd-collector-1          | 
odd-collector-1          | DateTime64(3, 'GMT')
odd-collector-1          |               ^
odd-collector-1          | Expected one of: 
odd-collector-1          |      * TIMEZONE
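
The message above says the parser expected a TIMEZONE terminal at col 15 but found a single quote there, which suggests the terminal does not accept the quoted form. How TIMEZONE is actually defined in the adapter's grammar is not shown in this thread, so the two patterns below are hypothetical stand-ins that only illustrate the symptom with the standard `re` module: a bare-name pattern fails at that column, while a pattern that includes the quotes matches.

```python
import re

# The exact type string from the log above.
SAMPLE = "DateTime64(3, 'GMT')"

# Column reported by the error message (1-based), converted to a 0-based index.
ERROR_INDEX = 15 - 1

# Hypothetical terminal that only accepts a bare zone name, e.g. GMT or Europe/Berlin.
UNQUOTED_TIMEZONE = re.compile(r"[A-Za-z_]+(?:/[A-Za-z_]+)*")

# Hypothetical alternative that accepts the single-quoted form seen in the log.
QUOTED_TIMEZONE = re.compile(r"'[A-Za-z0-9_+\-]+(?:/[A-Za-z0-9_+\-]+)*'")

if __name__ == "__main__":
    print("character at error column:", repr(SAMPLE[ERROR_INDEX]))
    print("bare pattern matches:", UNQUOTED_TIMEZONE.match(SAMPLE, ERROR_INDEX) is not None)
    print("quoted pattern matches:", QUOTED_TIMEZONE.match(SAMPLE, ERROR_INDEX) is not None)
```

If the real terminal turns out to be of the first kind, allowing the surrounding quotes in it (or adding a separate quoted-string terminal) would be one direction to try.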
