woothee/fluent-plugin-woothee

fluent-plugin-woothee

'fluent-plugin-woothee' is a Fluentd filter plugin that parses User-Agent strings and passes or drops messages by user-terminal category (such as 'pc', 'smartphone' and so on).

'woothee' is a multi-language user-agent string parser project. See: https://github.com/woothee/woothee

Configuration

To add the woothee parser result to messages:

<label @accesslog>
  <filter input.**>
    @type woothee
    key_name agent
    merge_agent_info yes
  </filter>
  <match ...>
  </match>
</label>

The resulting messages have attributes such as 'agent_name', 'agent_category' and 'agent_os' taken from the woothee parser result. To change the attribute names, or to also merge the OS version, browser version and browser vendor, configure the plugin as below:

<label @accesslog>
  <filter input.**>
    @type woothee
    key_name agent
    merge_agent_info yes
    
    out_key_name ua_name
    out_key_category ua_category
    out_key_os ua_os
    out_key_os_version ua_os_version
    out_key_version ua_version
    out_key_vendor ua_vendor
  </filter>
  <match ...>
  </match>
</label>
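
The effect of the two configurations above on a record can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the plugin's implementation: 'merge_agent_info' here is a hypothetical helper, the 'parsed' hash is a hand-written stand-in for a woothee parser result (its values are made up), and converting the category to a string is an assumption.

```ruby
# Default output keys, as described in the README text above.
DEFAULT_OUT_KEYS = {
  name: 'agent_name',
  category: 'agent_category',
  os: 'agent_os'
}.freeze

# Hypothetical helper: returns a copy of the record with the selected
# parser fields merged in under the configured key names.
def merge_agent_info(record, parsed, out_keys = DEFAULT_OUT_KEYS)
  merged = record.dup
  out_keys.each { |field, key| merged[key] = parsed[field].to_s }
  merged
end

# Stand-in for a parser result; values are illustrative assumptions.
parsed = {
  name: 'Firefox', category: :pc, os: 'Windows 7',
  os_version: 'NT 6.1', version: '50.0', vendor: 'Mozilla'
}
record = { 'agent' => 'example user-agent string', 'path' => '/index.html' }

# First configuration: default key names.
default_result = merge_agent_info(record, parsed)

# Second configuration: renamed keys plus the optional extra attributes.
custom_keys = {
  name: 'ua_name', category: 'ua_category', os: 'ua_os',
  os_version: 'ua_os_version', version: 'ua_version', vendor: 'ua_vendor'
}
custom_result = merge_agent_info(record, parsed, custom_keys)
```

In this sketch 'default_result' gains only the three default attributes, while 'custom_result' carries all six under the 'ua_' names; the original fields of the record are kept in both cases.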

To pass only messages whose user-agent falls in the specified categories (and merge the woothee parser result), configure like this:

<label @accesslog>
  <filter input.**>
    @type woothee
    key_name agent
    merge_agent_info yes
    filter_categories pc,smartphone,mobilephone,appliance
  </filter> # logs of other categories will be dropped
  
  # ...
</label>

Alternatively, you can specify categories to drop (here without merging the woothee result):

<label @accesslog>
  <filter input.**>
    @type woothee
    key_name agent
    merge_agent_info false # default
    drop_categories crawler
  </filter>
  
  # ...
</label>
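
The pass/drop behaviour of 'filter_categories' and 'drop_categories' can be summarised with a small Ruby sketch. It is a simplified model, not the plugin's code: 'parse_categories' and 'keep_record?' are hypothetical names, representing categories as symbols is an assumption, and rejecting the case where both options are set is a choice made for this sketch only.

```ruby
# Turn a config value such as "pc,smartphone" into a list of symbols.
def parse_categories(value)
  value.split(',').map { |c| c.strip.to_sym }
end

# Hypothetical decision helper: true if a record whose user-agent was
# classified as `category` should be passed downstream.
def keep_record?(category, filter_categories: nil, drop_categories: nil)
  if filter_categories && drop_categories
    # Assumption of this sketch: treat the two options as exclusive.
    raise ArgumentError, 'set only one of filter_categories/drop_categories'
  end
  return filter_categories.include?(category) if filter_categories
  return !drop_categories.include?(category) if drop_categories
  true # neither option set: every record passes
end

allowed = parse_categories('pc,smartphone,mobilephone,appliance')
dropped = parse_categories('crawler')

pc_passes_filter      = keep_record?(:pc, filter_categories: allowed)
crawler_passes_filter = keep_record?(:crawler, filter_categories: allowed)
crawler_passes_drop   = keep_record?(:crawler, drop_categories: dropped)
pc_passes_drop        = keep_record?(:pc, drop_categories: dropped)
```

With 'filter_categories' the list acts as an allow-list, so any category not named is dropped; with 'drop_categories' it acts as a deny-list, so any category not named passes.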

Fast Crawler Filter

If you want to drop almost all messages with a crawler's user-agent, without merging the woothee result, just specify this plugin type:

<filter input.**>
  @type woothee_fast_crawler_filter
  key_name useragent
</filter>

With this configuration, 'fluent-plugin-woothee' uses woothee's 'Woothee.is_crawler', a fast but incomplete check of whether a user-agent is a crawler. If you want to drop all crawlers completely, specify '@type woothee' and 'drop_categories crawler'.

Output plugin

The output-plugin version of the woothee plugin is not supported in the plugin versions for Fluentd v0.14.

TODO

  • patches welcome!

Copyright

  • Copyright (c) 2012- TAGOMORI Satoshi (tagomoris)
  • License
    • Apache License, Version 2.0
