-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add support for log-level filtering of structured IR streams. #35
Conversation
Co-authored-by: Junhao Liao <[email protected]>
WalkthroughThe changes in this pull request involve multiple updates across several files, primarily focusing on the Changes
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (16)
src/clp_ffi_js/constants.hpp (1)
22-22
: Consider adding range validation helpers.To make the valid range more useful, consider adding constexpr helper functions to validate log levels.
Here's a suggested addition:
constexpr bool IsValidLogLevel(LogLevel level) { return level >= cValidLogLevelsBeginIdx && level <= cValidLogLevelsEndIdx; }src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp (1)
Line range hint
24-28
: Consider updating class documentationSince this class now handles filtering capabilities through
LogEventWithFilterData
, it would be beneficial to update the class documentation to reflect these changes and document the filtering behaviour.src/clp_ffi_js/ir/StreamReader.hpp (1)
26-31
: Consider adding validation methods for the mapping.Since this type is central to the filtering mechanism, consider adding utility methods to:
- Validate index bounds
- Handle mapping updates when log events are modified
- Provide iterator access for efficient traversal
Example interface additions:
class StreamReader { // ... existing code ... /** * Validates that all indices in the mapping are within bounds * @return true if all indices are valid */ [[nodiscard]] virtual bool validate_mapping() const = 0; /** * Updates the mapping after log event modifications * @param old_index The original index * @param new_index The new index * @return true if update was successful */ virtual bool update_mapping(size_t old_index, size_t new_index) = 0; };src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp (1)
130-135
: Consider moving the log_event parameter.The implementation is correct, but there's a potential performance optimization opportunity.
Consider this minor optimization:
- auto log_event_with_filter_data{LogEventWithFilterData<UnstructuredLogEvent>( - log_event, - log_level, - log_event.get_timestamp() - )}; + auto log_event_with_filter_data{LogEventWithFilterData<UnstructuredLogEvent>( + std::move(log_event), + log_level, + log_event.get_timestamp() + )};src/clp_ffi_js/ir/StructuredIrUnitHandler.hpp (2)
97-103
: Correct the@return
description inget_log_level
documentationThe
@return
comment forget_log_level
incorrectly mentions "Timestamp" instead of "Log level".Apply this diff to fix the documentation:
/** * @param id_value_pairs - * @return Timestamp from `StructuredLogEvent` + * @return Log level from `StructuredLogEvent` */
120-123
: Address the TODO regardinggsl::not_null
usageThe TODO suggests replacing
std::shared_ptr
withgsl::not_null
oncegsl
is added to the project to reflect that the pointer should not be null.Would you like assistance in implementing this change, or should we create a GitHub issue to track this enhancement?
src/clp_ffi_js/ir/StructuredIrUnitHandler.cpp (4)
144-144
: Implement parsing for timestamp values of string typeThe TODO comment indicates that support for parsing timestamp values of string type is pending. Implementing this will enhance the flexibility of timestamp handling.
Would you like assistance in developing the parsing logic for timestamp strings?
110-112
: Handle unrecognized log level names after parsingAfter parsing the log level string using
parse_log_level
, if the result isLogLevel::NONE
, it implies an unrecognized log level. Consider logging a warning or handling this case to alert about unexpected log level values.
54-55
: Log a message when log level parsing failsWhen
parse_log_level
does not find a matching log level name, it currently returnsLogLevel::NONE
silently. Consider logging an informative message to aid in debugging issues related to unrecognized log level strings.
38-44
: Be cautious with character case conversion and localeThe use of
std::toupper
in converting the log level string to uppercase may not handle non-ASCII characters correctly. If internationalization is a concern, consider using locale-aware functions or specifying the locale explicitly to ensure proper case conversion.src/clp_ffi_js/ir/StructuredIrStreamReader.cpp (1)
Line range hint
135-138
: Inconsistent handling of log level filtering indecode_range
Despite adding filtering capabilities elsewhere,
decode_range
still logs an error stating that log level filtering is not supported. Consider updatingdecode_range
to support filtering or modifying the error message to reflect current functionality.Apply this diff to enable filtering in
decode_range
:- if (use_filter) { - SPDLOG_ERROR(cLogLevelFilteringNotSupportedErrorMsg); - return DecodedResultsTsType{emscripten::val::null()}; - } + if (use_filter) { + // Apply filtering logic here + // For example, adjust the range based on filtered indices + // ... (implementation details) + }src/clp_ffi_js/ir/LogEventWithFilterData.hpp (5)
46-47
: Re-evaluate the need to delete copy constructor and copy assignment operatorDeleting the copy constructor and copy assignment operator restricts the ability to copy
LogEventWithFilterData
instances. If copying of this class is acceptable and safe, consider allowing the default copy operations. If there are specific reasons to prevent copying (e.g., to avoid accidental copying of heavy resources), ensure that this aligns with the intended use of the class.
50-51
: Explicitly defaulted move operations may be redundantThe move constructor and move assignment operator are explicitly defaulted on lines 50-51. Since these would be implicitly generated by the compiler when copy operations are deleted, you may omit these explicit defaults unless there is a specific need to declare them.
53-54
: Destructor can be omittedThe destructor is explicitly defaulted on lines 53-54. If there is no custom destructor logic required, you can omit this declaration and allow the compiler to generate the destructor implicitly.
14-17
: Consider centralizing type aliases for log eventsThe type aliases
UnstructuredLogEvent
andStructuredLogEvent
are defined here. If these aliases are used across multiple files, consider moving them to a common header file to improve maintainability and reduce duplication.
1-2
: Header guard naming conventionThe header guard
CLP_FFI_JS_IR_LOGEVENTWITHFILTERDATA_HPP
lacks underscores between words, which may reduce readability. Consider renaming it toCLP_FFI_JS_IR_LOG_EVENT_WITH_FILTER_DATA_HPP
to align with common naming conventions and enhance clarity.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (10)
src/clp_ffi_js/constants.hpp
(1 hunks)src/clp_ffi_js/ir/LogEventWithFilterData.hpp
(1 hunks)src/clp_ffi_js/ir/LogEventWithLevel.hpp
(0 hunks)src/clp_ffi_js/ir/StreamReader.hpp
(1 hunks)src/clp_ffi_js/ir/StructuredIrStreamReader.cpp
(5 hunks)src/clp_ffi_js/ir/StructuredIrStreamReader.hpp
(2 hunks)src/clp_ffi_js/ir/StructuredIrUnitHandler.cpp
(1 hunks)src/clp_ffi_js/ir/StructuredIrUnitHandler.hpp
(1 hunks)src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp
(5 hunks)src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp
(2 hunks)
💤 Files with no reviewable changes (1)
- src/clp_ffi_js/ir/LogEventWithLevel.hpp
🧰 Additional context used
📓 Learnings (3)
src/clp_ffi_js/ir/LogEventWithFilterData.hpp (1)
Learnt from: davemarco
PR: y-scope/clp-ffi-js#32
File: src/clp_ffi_js/ir/LogEventWithFilterData.hpp:0-0
Timestamp: 2024-11-10T02:14:19.411Z
Learning: In the context of `LogEventWithFilterData`, it is acceptable for `log_level` to be `LogLevel::NONE`; additional validation against `LogLevel::NONE` is not necessary.
src/clp_ffi_js/ir/StreamReader.hpp (1)
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-04T19:35:32.602Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp (1)
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-04T19:35:32.602Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
🔇 Additional comments (24)
src/clp_ffi_js/constants.hpp (2)
22-22
: LGTM! The constant addition is well-placed and properly typed.
The new constant cValidLogLevelsEndIdx
pairs nicely with the existing cValidLogLevelsBeginIdx
to define the valid range of log levels.
22-22
: Verify the constant usage across the codebase.
Since this PR is implementing filtering for structured reader, let's verify how this constant is being used.
✅ Verification successful
Let me gather more information about the usage context.
The constant cValidLogLevelsEndIdx
is properly used for log level validation
The constant is used correctly in StructuredIrUnitHandler.cpp
to validate log levels during deserialization, ensuring they fall within the valid range (TRACE to FATAL). The implementation properly converts and validates the log level before using it in the LogEventWithFilterData
structure.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check usage of the new constant and related filtering logic
# Search for references to the new constant
rg -l "cValidLogLevelsEndIdx"
# Look for potential filter implementations using log levels
ast-grep --pattern 'class $_ {
$$$
LogLevel $$$
$$$
}'
Length of output: 167
Script:
#!/bin/bash
# Check the implementation details in StructuredIrUnitHandler.cpp
rg -A 5 "cValidLogLevelsEndIdx" src/clp_ffi_js/ir/StructuredIrUnitHandler.cpp
# Look for other log level related constants and their usage
rg "LogLevel::" src/clp_ffi_js/constants.hpp
# Check for any filter implementations
rg -A 10 "filter" src/clp_ffi_js/ir/StructuredIrUnitHandler.cpp
Length of output: 1179
src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp (2)
22-22
: LGTM! Good type alias addition
The new type alias improves code readability and maintains consistency with the codebase naming conventions.
83-83
: Verify the impact of type change on existing methods
The type change from LogEventWithLevel
to LogEventWithFilterData
for m_encoded_log_events
might affect existing methods that interact with this member. Please ensure all usages are updated accordingly.
✅ Verification successful
Let me analyze the results and generate additional verification scripts to check for any potential issues.
Based on the verification results, I can now provide a final response:
Type change to LogEventWithFilterData
is properly handled
The change from LogEventWithLevel
to LogEventWithFilterData
for m_encoded_log_events
has been properly implemented across all usages:
- All accesses to
m_encoded_log_events
are consistent with the new type - No references to the old
LogEventWithLevel
type remain in the codebase - The member is used correctly in all methods including:
get_num_events_buffered()
filter_log_events()
deserialize_stream()
decode_range()
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for methods accessing m_encoded_log_events
rg -A 3 "m_encoded_log_events" --type cpp
# Search for any remaining references to LogEventWithLevel
rg "LogEventWithLevel" --type cpp
Length of output: 3337
Script:
#!/bin/bash
# Search for the LogEventWithFilterData type definition and usage
ast-grep --pattern 'template<typename T> class LogEventWithFilterData { $$$ }'
# Search for any references to the old type pattern
ast-grep --pattern 'LogEventWithLevel<$_>'
# Look for method signatures that might be affected
rg "UnstructuredIrStreamReader::" --type cpp -A 2
Length of output: 2720
src/clp_ffi_js/ir/StreamReader.hpp (2)
26-30
: Well-designed type alias with clear documentation.
The FilteredLogEventsMap
type alias is well-chosen:
- Using
std::optional
appropriately handles cases where no mapping exists - Using
std::vector<size_t>
provides efficient index-based mapping - Documentation clearly explains the purpose
26-31
: Verify the impact on memory usage.
The mapping could potentially consume significant memory for large log files. Consider implementing a streaming approach or pagination mechanism for better memory efficiency.
✅ Verification successful
Let me analyze the code further to verify the memory usage concerns.
Memory usage concerns are already addressed with proper optimizations
The codebase already implements memory optimizations:
- Pre-allocation with
reserve()
is used to avoid reallocation overhead, with a reasonable default size of 500,000 events - The
StreamReader
design implements streaming throughdeserialize_stream()
anddecode_range()
methods - The
FilteredLogEventsMap
is anstd::optional
, meaning it's only allocated when filtering is actually used - Memory is properly released after stream exhaustion via
reset(nullptr)
on the stream reader context
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for large log file handling patterns
rg -A 5 "pagination|chunk|batch|stream" --type cpp
Length of output: 31871
Script:
#!/bin/bash
# Search for memory management patterns and vector reservations
rg -A 3 "reserve|capacity|memory" --type cpp
# Search for pagination or batch processing implementations
ast-grep --pattern 'class $_ {
$$$
decode_range($_, $_, $_) {
$$$
}
$$$
}'
Length of output: 3627
src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp (3)
26-26
: LGTM! Include changes are appropriate.
The addition of LogEventWithFilterData header and reorganization of includes align well with the new implementation.
Also applies to: 29-29
75-75
: LGTM! Good simplification of filtering logic.
The delegation to filter_deserialized_events
improves code organization and maintainability. As per previous learnings from PR #26, TypeScript typing ensures safe log level filter inputs.
Line range hint 169-193
: LGTM! Proper adaptation to the new data structure.
The changes correctly handle the LogEventWithFilterData structure while maintaining robust error handling and proper timestamp formatting.
src/clp_ffi_js/ir/StructuredIrStreamReader.hpp (6)
14-14
: Include of LogEventWithFilterData.hpp
is appropriate.
The inclusion of LogEventWithFilterData.hpp
is necessary for handling log events with filtering capabilities.
17-17
: Include of StructuredIrUnitHandler.hpp
is appropriate.
Including StructuredIrUnitHandler.hpp
ensures access to the new unit handler for structured IR units.
21-21
: Type alias StructuredIrDeserializer
is correctly defined.
Defining StructuredIrDeserializer
simplifies references to the deserializer specialized with StructuredIrUnitHandler
.
81-82
: Constructor parameter updated to handle filtered log events.
The constructor now accepts a std::shared_ptr
to a vector of LogEventWithFilterData<StructuredLogEvent>
, aligning with the new filtering mechanism.
86-87
: Member variable m_deserialized_log_events
updated appropriately.
Updating m_deserialized_log_events
to store log events with filter data matches the changes in the constructor and supports the enhanced processing of log events.
89-89
: Addition of m_filtered_log_event_map
supports new filtering features.
Introducing m_filtered_log_event_map
provides the necessary mapping for filtered log events, which is essential for the filtering functionality.
src/clp_ffi_js/ir/StructuredIrStreamReader.cpp (9)
23-27
: Includes updated to support new filtering functionality
The necessary headers have been appropriately included to support the new filtering capabilities.
35-36
: Added constants for reader options keys
The constants cReaderOptionsLogLevelKey
and cReaderOptionsTimestampKey
are correctly defined for accessing reader options.
44-46
: Initialized deserialized_log_events
with updated data structure
The shared pointer to the vector of LogEventWithFilterData<StructuredLogEvent>
is properly initialized.
81-85
: Properly handling optional m_filtered_log_event_map
The method correctly checks if m_filtered_log_event_map
has a value before attempting to access it, ensuring safe operation.
89-93
: Simplified call to filter_deserialized_events
The filter_log_events
method now directly calls filter_deserialized_events
with the appropriate parameters, enhancing code clarity.
147-148
: Accessing log events using updated data structure
The code correctly retrieves the log event from log_event_with_filter_data
using the new data structure.
150-150
: Serializing structured log event to JSON
The serialize_to_json
method is appropriately called on the structured log event.
167-168
: Passing timestamp and log level to JavaScript context
The timestamp and log level are correctly obtained and passed to the JavaScript context via EM_ASM
.
178-179
: Constructor updated with new log event structure
The constructor now accepts deserialized_log_events
as a shared pointer to a vector of LogEventWithFilterData<StructuredLogEvent>
, aligning with the updated data structures.
#include <clp/ir/LogEvent.hpp> | ||
#include <clp/ir/LogEventDeserializer.hpp> | ||
#include <clp/ir/types.hpp> | ||
#include <clp/TimestampPattern.hpp> | ||
#include <emscripten/val.h> | ||
|
||
#include <clp_ffi_js/ir/LogEventWithLevel.hpp> | ||
#include <clp_ffi_js/ir/LogEventWithFilterData.hpp> | ||
#include <clp_ffi_js/ir/StreamReader.hpp> | ||
#include <clp_ffi_js/ir/StreamReaderDataContext.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restore the <optional>
header inclusion
The <optional>
header appears to be still required as the class member m_filtered_log_event_map
is of type FilteredLogEventsMap
which likely uses std::optional
. Its removal might cause compilation errors.
+#include <optional>
#include <clp/ir/LogEvent.hpp>
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
#include <clp/ir/LogEvent.hpp> | |
#include <clp/ir/LogEventDeserializer.hpp> | |
#include <clp/ir/types.hpp> | |
#include <clp/TimestampPattern.hpp> | |
#include <emscripten/val.h> | |
#include <clp_ffi_js/ir/LogEventWithLevel.hpp> | |
#include <clp_ffi_js/ir/LogEventWithFilterData.hpp> | |
#include <clp_ffi_js/ir/StreamReader.hpp> | |
#include <clp_ffi_js/ir/StreamReaderDataContext.hpp> | |
#include <optional> | |
#include <clp/ir/LogEvent.hpp> | |
#include <clp/ir/LogEventDeserializer.hpp> | |
#include <clp/ir/types.hpp> | |
#include <clp/TimestampPattern.hpp> | |
#include <emscripten/val.h> | |
#include <clp_ffi_js/ir/LogEventWithFilterData.hpp> | |
#include <clp_ffi_js/ir/StreamReader.hpp> | |
#include <clp_ffi_js/ir/StreamReaderDataContext.hpp> |
}; | ||
} // namespace clp_ffi_js::ir | ||
|
||
#endif // CLP_FFI_JS_IR_STRUCTUREDIRSTREAMREADER_HPP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the inconsistent header guard macro
The header guard macro at the end of the file does not match the one at the beginning, which may lead to compilation issues.
Apply this diff to correct the header guard:
-#endif // CLP_FFI_JS_IR_STRUCTUREDIRSTREAMREADER_HPP
+#endif // CLP_FFI_JS_IR_STRUCTUREDIRUNITHANDLER_HPP
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
#endif // CLP_FFI_JS_IR_STRUCTUREDIRSTREAMREADER_HPP | |
#endif // CLP_FFI_JS_IR_STRUCTUREDIRUNITHANDLER_HPP |
if (log_level_node_value <= (clp::enum_to_underlying_type(cValidLogLevelsEndIdx))) { | ||
log_level = static_cast<LogLevel>(log_level_node_value); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validate lower bound when checking integer log level values
In the get_log_level
method, when validating the integer value of the log level, the code only checks if log_level_node_value <= (clp::enum_to_underlying_type(cValidLogLevelsEndIdx))
. Please add a lower bound check to ensure the value falls within the valid range.
Apply this diff to add the lower bound check:
- if (log_level_node_value <= (clp::enum_to_underlying_type(cValidLogLevelsEndIdx))) {
+ if (log_level_node_value >= clp::enum_to_underlying_type(cValidLogLevelsBeginIdx) &&
+ log_level_node_value <= clp::enum_to_underlying_type(cValidLogLevelsEndIdx)) {
log_level = static_cast<LogLevel>(log_level_node_value);
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if (log_level_node_value <= (clp::enum_to_underlying_type(cValidLogLevelsEndIdx))) { | |
log_level = static_cast<LogLevel>(log_level_node_value); | |
} | |
if (log_level_node_value >= clp::enum_to_underlying_type(cValidLogLevelsBeginIdx) && | |
log_level_node_value <= clp::enum_to_underlying_type(cValidLogLevelsEndIdx)) { | |
log_level = static_cast<LogLevel>(log_level_node_value); | |
} |
reader_options[cReaderOptionsLogLevelKey.data()].as<std::string>(), | ||
reader_options[cReaderOptionsTimestampKey.data()].as<std::string>() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Ensure presence of required keys in reader_options
The code assumes that reader_options
contains the keys logLevelKey
and timestampKey
. To prevent potential exceptions, consider adding error handling to verify that these keys are present.
Apply this diff to add key existence checks:
+if (!reader_options.hasOwnProperty(cReaderOptionsLogLevelKey.data()) ||
+ !reader_options.hasOwnProperty(cReaderOptionsTimestampKey.data())) {
+ throw ClpFfiJsException{
+ clp::ErrorCode::ErrorCode_InvalidArgument,
+ __FILENAME__,
+ __LINE__,
+ "Required reader options keys are missing"
+ };
+}
Committable suggestion skipped: line range outside the PR's diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's create a issue for JavaScript support (usage without the TS definition file) and keep a list of TODOs there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@junhaoliao, I've created the issue as requested: Issue #36 to track JavaScript support without the TypeScript definition file and to keep a list of TODOs.
🔗 Analysis chain
@junhaoliao, I've created a new GitHub issue to track JavaScript support without the TypeScript definition file and to keep a list of TODOs.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Create a new GitHub issue for JavaScript support without the TypeScript definition file
gh issue create \
--title "JavaScript support without the TypeScript definition file" \
--body "We need to support JavaScript usage without relying on the TypeScript definition file.
This issue is created to keep a list of TODOs for implementing this feature.
Referenced from [PR #35](https://github.com/y-scope/clp-ffi-js/pull/35) and [this comment](https://github.com/y-scope/clp-ffi-js/pull/35#discussion_r1835929602) by @junhaoliao." \
--assignee junhaoliao
Length of output: 507
✏️ Learnings added
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#35
File: src/clp_ffi_js/ir/StructuredIrStreamReader.cpp:50-51
Timestamp: 2024-11-14T15:31:45.747Z
Learning: Ensure that `clp-ffi-js` supports JavaScript usage without relying on the TypeScript definition file, and keep track of related TODOs in a GitHub issue.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
// TODO: Extend valid log event types when filtering support is added for structured logs. | ||
template <typename T> | ||
concept ValidLogEventTypes | ||
= std::same_as<T, UnstructuredLogEvent> || std::same_as<T, StructuredLogEvent>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Update or remove outdated TODO comment regarding valid log event types
The TODO comment on line 19 suggests extending valid log event types when filtering support is added for structured logs. However, StructuredLogEvent
is already included in ValidLogEventTypes
on line 22. Please consider updating or removing the TODO comment to reflect the current state of the code.
bool use_filter | ||
) -> DecodedResultsTsType { | ||
if (use_filter && false == filtered_log_event_map.has_value()) { | ||
return DecodedResultsTsType{emscripten::val::null()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't in the log viewer code as well -
Can we add a log print before returning so that we know the input argument check fails here rather than at the range check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here i dont think so. This should be a normal case when a filter is applied, and the file does not have any logs with that filter level.
length = log_events.size(); | ||
} | ||
if (length < end_idx || begin_idx > end_idx) { | ||
return DecodedResultsTsType{emscripten::val::null()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see https://github.com/y-scope/clp-ffi-js/pull/35/files#r1884668582
it might be a good idea to add a log print here before returning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added an error log here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tested with a clp-ffi-py generated IRv2 file with various log levels. all good except
- logs can be printed before returning null in
auto StreamReader::generic_decode_range
: https://github.com/y-scope/clp-ffi-js/pull/35/files#r1884668582 - a pending discussion at https://github.com/y-scope/clp-ffi-js/pull/35/files#r1881142029
0581310
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
src/clp_ffi_js/ir/StreamReader.hpp (2)
38-39
: Add documentation for the LogEvents type aliasConsider adding a documentation comment explaining the purpose and usage of this type alias, similar to the documentation provided for FilteredLogEventsMap.
Add this documentation above the type alias:
+/** + * Collection of log events with their associated filter data (timestamp and log level). + */ template <typename LogEvent> using LogEvents = std::vector<LogEventWithFilterData<LogEvent>>;
246-256
: Consider extracting the filtering logic into a named lambda for better readabilityThe filtering logic could be made more readable by extracting it into a named lambda.
- for (size_t log_event_idx = 0; log_event_idx < log_events.size(); ++log_event_idx) { - auto const& log_event = log_events[log_event_idx]; - if (std::ranges::find( - filter_levels, - clp::enum_to_underlying_type(log_event.get_log_level()) - ) - != filter_levels.end()) - { - filtered_log_event_map->emplace_back(log_event_idx); - } - } + auto const matches_filter_level = [&filter_levels](auto const& log_event) { + return std::ranges::find( + filter_levels, + clp::enum_to_underlying_type(log_event.get_log_level()) + ) != filter_levels.end(); + }; + + for (size_t log_event_idx = 0; log_event_idx < log_events.size(); ++log_event_idx) { + if (matches_filter_level(log_events[log_event_idx])) { + filtered_log_event_map->emplace_back(log_event_idx); + } + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/clp_ffi_js/ir/StreamReader.hpp
(4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/clp_ffi_js/ir/StreamReader.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
📓 Learnings (1)
src/clp_ffi_js/ir/StreamReader.hpp (2)
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/StreamReader.cpp:35-45
Timestamp: 2024-11-10T16:46:58.543Z
Learning: An IRv1 version validation utility function is not planned to be added to CLP-core soon.
🔇 Additional comments (2)
src/clp_ffi_js/ir/StreamReader.hpp (2)
59-60
: Add documentation for the reader_options parameter
The reader_options parameter is missing from the docstring.
174-229
: Well-implemented error handling and type safety
The implementation includes:
- Proper concept requirements for type safety
- Clear error handling with appropriate logging
- Efficient range validation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments about documentation for the exception behaviour
For the PR title, how about
feat(StructuredIrStreamReader): Add support for log-level filtering.
return log_level; | ||
} | ||
|
||
auto StructuredIrUnitHandler::get_timestamp( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be deduplicated with StructuredIrUnitHandler::get_log_level
, but for the sake of simplicity let's leave this for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes perhaps in future could be a higher level function but logic is not entirely the same
auto const parsed{unstructured_log_event.get_message().decode_and_unparse()}; | ||
auto log_event_to_string = [this](UnstructuredLogEvent const& log_event) -> std::string { | ||
std::string message; | ||
auto const parsed{log_event.get_message().decode_and_unparse()}; | ||
if (false == parsed.has_value()) { | ||
throw ClpFfiJsException{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the structured one also throws the same error when a message cannot be decoded, let's document this @throw
in the docstring at
clp-ffi-js/src/clp_ffi_js/ir/StreamReader.hpp
Lines 100 to 101 in d2556b2
[[nodiscard]] virtual auto decode_range(size_t begin_idx, size_t end_idx, bool use_filter) const | |
-> DecodedResultsTsType = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this comment is superceded by #35 (comment) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the confusion. I mean we want to add this at
*/ |
@throw ClpFfiJsException if any error occurs.
or this may be more clear:
@throw ClpFfiJsException if a message cannot be decoded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davemarco in case you haven't been notified of this response
Co-authored-by: Junhao Liao <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/clp_ffi_js/ir/StreamReader.hpp (2)
59-60
: Documentation needs more detail for reader_options parameterThe
reader_options
parameter documentation should specify that it requires bothlogLevelKey
andtimestampKey
fields, along with their purposes.- * @param reader_options + * @param reader_options Configuration object containing: + * - logLevelKey: Key to extract log level from structured logs + * - timestampKey: Key to extract timestamp from structured logs
206-227
: Consider reserving vector capacity for performanceThe results array's size is known before the loop begins. Consider pre-allocating the required capacity to avoid potential reallocations.
auto const results{emscripten::val::array()}; + EM_ASM( + { Emval.toValue($0).length = $1; }, + results.as_handle(), + end_idx - begin_idx + );
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/clp_ffi_js/ir/StreamReader.hpp
(4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/clp_ffi_js/ir/StreamReader.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
📓 Learnings (1)
src/clp_ffi_js/ir/StreamReader.hpp (2)
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/StreamReader.cpp:35-45
Timestamp: 2024-11-10T16:46:58.543Z
Learning: An IRv1 version validation utility function is not planned to be added to CLP-core soon.
🔇 Additional comments (3)
src/clp_ffi_js/ir/StreamReader.hpp (3)
38-45
: Well-designed type aliases that improve code clarity!
The introduction of LogEvents
and FilteredLogEventsMap
type aliases enhances code readability and maintainability. The documentation for FilteredLogEventsMap
clearly explains its purpose.
189-202
: Robust error handling with appropriate logging!
The implementation properly handles edge cases and includes helpful error logging for invalid ranges.
232-258
: Clean and efficient filtering implementation!
The implementation makes good use of modern C++ features like std::ranges and properly handles memory management. The null check for log_level_filter provides appropriate error handling.
re: title I think the current title is better. I think adding |
src/clp_ffi_js/ir/StreamReader.hpp
Outdated
* @param use_filter | ||
* @param log_event_to_string | ||
* @return See `decode_range`. | ||
* @throw Propagates exceptions from `ToStringFunc`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The suggested wording was from the Coding Guidelines. Might be good to keep that for consistency: https://www.notion.so/yscope/WIP-Coding-Guidelines-9a308b847a5343958ba3cb97a850be66?pvs=4#13504e4d9e6b80d3bc29c87e930fcee1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kk i fixed
Right, the scope is optional in Conventional Commit messages. Sure let's leave the PR title as is |
I added change you suggested for #35 (comment) . I also added a comment for deserialize stream, since that can also throw exception. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/clp_ffi_js/ir/StreamReader.hpp (2)
38-39
: Add documentation for the LogEvents type alias.Please add a documentation block explaining the purpose and usage of this type alias, similar to how FilteredLogEventsMap is documented.
+/** + * Collection of log events with associated filter data. + */ template <typename LogEvent> using LogEvents = std::vector<LogEventWithFilterData<LogEvent>>;
191-193
: Consider using more idiomatic condition style.The condition could be written more idiomatically.
- if (use_filter && false == filtered_log_event_map.has_value()) { + if (use_filter && !filtered_log_event_map.has_value()) { return DecodedResultsTsType{emscripten::val::null()}; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/clp_ffi_js/ir/StreamReader.hpp
(5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/clp_ffi_js/ir/StreamReader.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
📓 Learnings (1)
src/clp_ffi_js/ir/StreamReader.hpp (2)
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/StreamReader.cpp:35-45
Timestamp: 2024-11-10T16:46:58.543Z
Learning: An IRv1 version validation utility function is not planned to be added to CLP-core soon.
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
🔇 Additional comments (2)
src/clp_ffi_js/ir/StreamReader.hpp (2)
59-60
: Add documentation for the reader_options parameter.
The reader_options parameter is missing from the docstring.
* @param data_array An array containing a Zstandard-compressed IR stream.
- * @param reader_options
+ * @param reader_options Configuration options for the reader including logLevelKey and timestampKey.
* @return The created instance.
234-260
: Implementation looks good!
The generic_filter_log_events implementation is well-structured, using modern C++ features and proper error handling. The use of std::ranges::find and emscripten::vecFromJSArray is appropriate for the task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/clp_ffi_js/ir/StreamReader.hpp (3)
59-60
: Enhance reader_options parameter documentation.The
reader_options
parameter documentation should describe its structure, particularly the requiredlogLevelKey
andtimestampKey
fields.
165-167
: Remove duplicate @param log_events tag.The parameter
log_events
is documented twice in the function description.
191-193
: Consider simplifying the null check.While the current implementation follows the coding guideline for comparison order, consider using
!filtered_log_event_map
for better readability sincestd::optional
provides a boolean conversion operator.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/clp_ffi_js/ir/StreamReader.hpp
(5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/clp_ffi_js/ir/StreamReader.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
📓 Learnings (1)
src/clp_ffi_js/ir/StreamReader.hpp (2)
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/StreamReader.cpp:35-45
Timestamp: 2024-11-10T16:46:58.543Z
Learning: An IRv1 version validation utility function is not planned to be added to CLP-core soon.
Learnt from: junhaoliao
PR: y-scope/clp-ffi-js#26
File: src/clp_ffi_js/ir/IrStreamReader.cpp:58-59
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In the `IrStreamReader::filter_log_events` method in `src/clp_ffi_js/ir/IrStreamReader.cpp`, the wrapper code is typed with TypeScript, ensuring that non-array inputs for `log_level_filter` would fail static checks, so additional runtime error handling is unnecessary.
🔇 Additional comments (2)
src/clp_ffi_js/ir/StreamReader.hpp (2)
38-45
: Well-designed type aliases that improve code clarity!
The new type aliases LogEvents
and FilteredLogEventsMap
effectively simplify complex types and provide clear semantic meaning. The use of std::optional
for FilteredLogEventsMap
is particularly appropriate for representing the presence or absence of filtered events.
249-259
: Well-implemented filtering logic using modern C++ features!
The use of std::ranges::find
with clp::enum_to_underlying_type
is an efficient and type-safe approach for filtering log events.
I just retested with IRV2 and IRV1 files, and was able to open and filter files with no issues. |
Description
Adds log level filtering for structured reader.
Design is similiar to junhao prototype. We retrieve log level and timestamp node id from the new node ir unit as the stream is being deserialized. Then we use the cached node id to retrieve values from the logs.
Difference from prototype is that parsing is done in deserialization stage.
I also added support for both string and int log levels.
Validation performed
I tested on one log file with a single int log level..."test-irv2"
@junhaoliao, if we do need string log level, is there a test file for it?
Summary by CodeRabbit
New Features
StructuredIrUnitHandler
for improved log event processing.ReaderOptions
structure to includelogLevelKey
.Bug Fixes
Refactor
filter_log_events
anddecode_range
methods for better maintainability.IrUnitHandler
class to streamline log event handling.Chores