A practical guide to Working Set Analysis Logging Library
This page describes how to add logging backed by the Working Set Analysis Logging Library (WSALoggingLib) into your C++ code base.
Prerequisite
Create the logger config in www
Follow this page to create a logger config in the www repo (Hack); see example D25831871. Canonical columns: https://www.internalfb.com/intern/logger/canonical_fields
Actualize your config
Run this command in www to actualize your config (create and initialize the Hive table).
Note that after actualizing your config, there are restrictions on changing it (see "Modify the logger config" below).
phps LoggerSync Actualize YourLoggerConfig && meerkat
You can search for your table name in Scuba to see what was created; you will find multiple tables with your table name and different suffixes. Data appears in your main table only after 24 hours, but you may see data in the table ending in inc_archive on an hourly basis. For the meaning of each suffix, see this post.
Sync config to fbcode
cd fbsource/fbcode
dsi/logger/cpp/sync_logger config YourLoggerConfig
buck build dsi/logger/configs/YourLoggerConfig:logger
Modify the logger config
- Add a new column (non-partition): this is simple; just add the new column in the php file.
- Add a partition column / change partitions: this is not allowed; you have to rename your table (i.e., create a new table).
- Change a column type: this is not recommended; it is better to give the data a new name (i.e., a new column).
- For more details, refer to this page
Overview
Since you have a logger powered by a config that takes care of actualization (creating the underlying Scribe/Scuba/Hive tables) for you, you may wonder what is stopping you from calling logger.log() wherever you want right now. WSALoggingLib provides two pieces of functionality:
- It supports different sampling rates based on criteria defined by you.
- It supports customized aggregation.
To power these two features, the library requires you to provide more configs and your own customized logic to consume those configs.
Below is how logging with WSALoggingLib works. Let's call the application X.
- When the application code has an event to log, it sends it to XWorkingSetTracker. XWorkingSetTracker holds an XWorkingSetSampler and an XRequestLog. Upon receipt of a new event, it tests whether the event passes sampling with the sampler. If it does, it forwards the event to XRequestLog.
- XRequestLog holds an XRequestRecordAggregator. Upon receiving the event, it creates an XRequestRecord based on the event, which is then aggregated with the other records that share the same key.
- XRequestRecordAggregator wraps the actual logger that logs to the remote service and holds a buffer of all records being aggregated during the aggregation interval. Once the aggregation interval elapses or the buffer fills up, it sends the aggregated records to the logging service (Hive) and flushes the buffer.
You will be defining all of these classes whose names start with X, but many of them only need to be a definition that fills class names into already defined generic templates, and the logic required from you mostly relates to your application's business logic.
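As a rough mental model, here is a minimal, standalone sketch of that flow using toy stand-in types; the real XWorkingSetTracker, XRequestLog, and aggregator classes come from WSALoggingLib and D21512456, so treat the names and signatures below as illustrative only.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Toy stand-in for an aggregated record: merging just bumps a counter.
struct ToyRecord {
  int64_t opCount = 0;
};

// Toy stand-in for XWorkingSetTracker: it owns a sampling predicate (the
// sampler) and a key -> record map (the request log / aggregation buffer).
class ToyTracker {
 public:
  explicit ToyTracker(std::function<bool(const std::string&)> sampler)
      : sampler_(std::move(sampler)) {}

  // Called by the application for every event it wants to log.
  void logOp(const std::string& key) {
    if (!sampler_(key)) {
      return;                // event dropped by sampling
    }
    buffer_[key].opCount++;  // aggregated with records sharing the same key
  }

  // Stands in for the periodic flush to the remote logging service (Hive).
  void flush() { buffer_.clear(); }

 private:
  std::function<bool(const std::string&)> sampler_;
  std::unordered_map<std::string, ToyRecord> buffer_;
};
```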
What classes to create
You don't have to provide all of the above implementations from scratch. In fact, much of the time you just need to create a class by plugging class names into the generic classes provided by WSALoggingLib.
This diff D21512456 contains all the classes you need to add to your code base to utilize WSALoggingLib. We'll group all these classes by their purpose and talk about each of them.
Sampling
These classes determine how we do sampling. Sampling means deciding, given an event, whether we add that event to the aggregation and eventually log it to the remote logger. Sampling is captured by WarmStorageSampler.h. The goal is to create a subclass of Sampler of specific types. To do so, we need to provide host info, telling the sampler about the current host (so it can select the right config from Configerator), and a SamplerImpl, providing the detailed sampling implementation.
Below are the classes that we need to define or modify.
- WarmStorageHostInfo: This is the host info class, provided by the client application, which is used to initialize the sampler.
- WarmStorageSamplerImpl.cpp: WarmStorage samples by blockID and host, so we can use KvSamplerImpl for the implementation. All we need to do is supply a static function, graphene::ws::KvSamplerImpl::buildNewSampler, which provides the logic to create a KvSamplerImpl from host information. WarmStorageSamplerImpl.cpp can be renamed to anything as long as it provides KvSamplerImpl::buildNewSampler.
- WarmStorageSampler.h: Just defines the class WarmStorageSampler.
Please note that the key used for sampling can be different from the record key that we discuss later. The sampler does not store any state about the keys it samples.
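The sketch below shows the shape of these three pieces with simplified stand-ins; the real base classes and the exact buildNewSampler signature live in WSALoggingLib (see D21512456), so everything here, including the hashing math and the rate-selection rule, is an assumption for illustration.

```cpp
#include <cstdint>
#include <functional>
#include <memory>
#include <string>

// Stand-in for WarmStorageHostInfo: whatever the sampler needs in order to
// pick the right Configerator config for this host.
struct ToyHostInfo {
  std::string cluster;
  std::string hostname;
};

// Stand-in for a KvSamplerImpl-style implementation: decides per key whether
// an event is sampled.
class ToyKvSamplerImpl {
 public:
  explicit ToyKvSamplerImpl(double rate) : rate_(rate) {}

  bool sample(const std::string& key) const {
    // Deterministic per-key decision; the real impl uses furcHash.
    return (std::hash<std::string>{}(key) % 1000) < rate_ * 1000;
  }

  // Analogue of the static KvSamplerImpl::buildNewSampler you must supply:
  // turn host information into a configured sampler impl.
  static std::unique_ptr<ToyKvSamplerImpl> buildNewSampler(const ToyHostInfo& host) {
    double rate = (host.cluster == "sandbox") ? 1.0 : 0.01;  // made-up rule
    return std::make_unique<ToyKvSamplerImpl>(rate);
  }

 private:
  double rate_;
};

// WarmStorageSampler.h then only needs the equivalent of a named
// instantiation of the generic Sampler template over your host info and
// impl types (shape only; the exact template parameters may differ).
```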
Behind the scenes
- The base class Sampler contains the logic for interacting with config changes. It supports subscription to a Configerator config and handles the thread safety of rotating in config updates.
- When we call warmStorageSampler.sampleRequest, the sampler finds the currently active samplerImpl (https://fburl.com/diffusion/5m8w6iyn). Since we are calling with a folly::StringPiece, KvSamplerImpl will use the sample function that applies furcHash to the string key.
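To make the thread-safety point concrete, here is a simplified, standalone illustration of how an active sampler impl can be swapped atomically when a config update arrives; this is a conceptual model of the behaviour described above, not the Sampler base class's actual code.

```cpp
#include <atomic>
#include <functional>
#include <memory>
#include <string>

// Simplified sampler impl: a per-key decision driven by a configured rate.
struct ToyImpl {
  double rate = 0.0;
  bool sample(const std::string& key) const {
    return (std::hash<std::string>{}(key) % 1000) < rate * 1000;
  }
};

class ToySampler {
 public:
  // Hot path: read the currently active impl without blocking on updates.
  bool sampleRequest(const std::string& key) const {
    std::shared_ptr<const ToyImpl> impl = std::atomic_load(&impl_);
    return impl && impl->sample(key);
  }

  // Config subscription callback: rotate in the new impl atomically.
  void onConfigUpdate(std::shared_ptr<const ToyImpl> newImpl) {
    std::atomic_store(&impl_, std::move(newImpl));
  }

 private:
  std::shared_ptr<const ToyImpl> impl_;
};
```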
Record
WarmStorageRequestRecord.h defines the record class. Every WSALoggingLib record class is a graphene::ws::RequestRecord of a certain key type. WarmStorageRequestKey is the unit of aggregation, which means that within the update interval, all records with the same key are merged together. For WarmStorage, every field of the record is part of the key, so when a merge happens, only the opCount field of the RequestRecord base class is incremented. In another example, ZippyDBRecord has fields that are merged in its own merge function.
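Here is a toy illustration of that record/key pattern; the real classes derive from graphene::ws::RequestRecord, while these stand-ins only mirror the shape: a key that defines the unit of aggregation, plus a merge step that folds together records sharing that key (the bytesRead field is a made-up example of what a ZippyDB-style custom merge would sum).

```cpp
#include <cstdint>
#include <string>
#include <tuple>

// Stand-in for a request key: for the WarmStorage-style case, every field of
// the record is part of the key.
struct ToyRequestKey {
  std::string blockId;
  std::string opType;

  bool operator==(const ToyRequestKey& other) const {
    return std::tie(blockId, opType) == std::tie(other.blockId, other.opType);
  }
};

// Stand-in for the record. When two records share the same key they are
// merged: in the simplest case only opCount is incremented; a record with
// extra payload fields folds them in here as well.
struct ToyRequestRecord {
  ToyRequestKey key;
  int64_t opCount = 1;
  int64_t bytesRead = 0;  // hypothetical extra field with a custom merge

  void merge(const ToyRequestRecord& other) {
    opCount += other.opCount;
    bytesRead += other.bytesRead;
  }
};
```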
Aggregator
To use the aggregation, we need to define the following classes:
- WarmStorageRequestRecordAggregator = RequestRecordAggregator<WarmStorageRequestRecord>. This simply defines an interface class that consumes the record class.
- WarmStorageRequestRecordAggregatorHive: This is the implementation of the aggregator that finally logs the aggregated records to Hive. This class also populates host information into the Hive record.
- WarmStorageRequestLog = RequestLog<WarmStorageRequestRecord, folly::SpinLock>
That's it. There is very little to implement, since all the complicated processing is hidden in the base classes described below.
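As a rough sketch of that layering, the stand-ins below mirror the division of responsibility (an interface that consumes aggregated records, plus a Hive-facing implementation that adds host information before logging); the real classes are thin instantiations of the WSALoggingLib templates named above, so none of the signatures here are the library's.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Simplified record; see the Record section sketch above.
struct ToyRequestRecord {
  int64_t opCount = 1;
};

// Stand-in for the aggregator interface (WarmStorageRequestRecordAggregator
// in the text): it consumes batches of aggregated records.
class ToyRecordAggregator {
 public:
  virtual ~ToyRecordAggregator() = default;
  virtual void logAggregated(const std::vector<ToyRequestRecord>& records) = 0;
};

// Stand-in for the Hive implementation: decorate each record with host info
// and hand it to the remote logger.
class ToyHiveAggregator : public ToyRecordAggregator {
 public:
  explicit ToyHiveAggregator(std::string hostname)
      : hostname_(std::move(hostname)) {}

  void logAggregated(const std::vector<ToyRequestRecord>& records) override {
    for (const auto& record : records) {
      // Build the Hive row from `record` plus hostname_ and send it to the
      // underlying logger here.
      (void)record;
    }
  }

 private:
  std::string hostname_;
};
```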
Behind the scenes
- The base class RequestLog contains the core logic for aggregation. This class stores data in a number of shards. Each shard contains a map from keys to records. The reason for sharding is to reduce locking contention.
- Each shard has its own memory arena to store variable sized keys (strings).
- A thread runs an aggregation loop that collects the shards that are ready to be aggregated, logs them to the remote logger, and flushes the memory.
- When a new event comes in, recordOperation gets called; this is where the record merge happens.
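The standalone sketch below illustrates the sharding idea (not the real RequestLog code): each shard has its own lock and its own key-to-record map, and recordOperation merges into the shard selected by hashing the key, so concurrent writers rarely contend on the same lock.

```cpp
#include <array>
#include <cstdint>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_map>

struct ToyRecord {
  int64_t opCount = 0;
};

class ToyShardedRequestLog {
 public:
  // Called for every sampled event; this is where the merge happens.
  void recordOperation(const std::string& key) {
    Shard& shard = shards_[std::hash<std::string>{}(key) % shards_.size()];
    std::lock_guard<std::mutex> guard(shard.lock);
    shard.records[key].opCount++;  // merge with the record sharing this key
  }

 private:
  struct Shard {
    std::mutex lock;
    std::unordered_map<std::string, ToyRecord> records;
  };
  // nShards is configurable in the real library; 16 is an arbitrary choice.
  std::array<Shard, 16> shards_;
};
```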
API
The last class we need to define is the interface class that the application holds a reference to and calls into.
The class WarmStorageWorkingSetTracker contains the WarmStorageSampler and the WarmStorageRequestLog. It provides one customized method, logOP, that the application calls to send an event. We initialize this class with some parameters that it feeds into either the sampler or the request log:
- nShards: Number of shards for the RequestLog. This number does not affect the overall size, since the max configs below are overall limits, not per-shard limits.
- collectionInterval: Default collection time interval. Records will be flushed at least once per this time interval.
- maxBacklog: Largest number of aggregated records we keep before flushing.
- maxBufferSize: The size of the buffer (arena size) that stores the variable-sized keys. This is roughly the memory overhead introduced by WSALoggingLib.
Example of logging in an application: https://fburl.com/diffusion/k3rsv2m0
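To make the parameters concrete, here is a hypothetical initialization sketch; the field names follow the list above, but the real WarmStorageWorkingSetTracker constructor comes from D21512456, so the struct, defaults, and usage below are assumptions for illustration only.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical bundle of the parameters described above.
struct ToyTrackerParams {
  std::size_t nShards = 16;                     // shards in the RequestLog
  std::chrono::seconds collectionInterval{60};  // flush at least this often
  std::size_t maxBacklog = 100000;              // max aggregated records kept
  std::size_t maxBufferSize = 8 * 1024 * 1024;  // arena bytes for keys
};

int main() {
  ToyTrackerParams params;
  params.nShards = 32;
  params.collectionInterval = std::chrono::seconds(30);
  // A real application would pass these into the tracker's constructor and
  // then call something like tracker.logOP(event) on its hot path.
  return 0;
}
```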
Unit tests
Please make sure you write unit tests! Examples: https://fburl.com/diffusion/qnb6p95n
You can create a mock aggregator to avoid logging to Hive and to examine what you actually logged.
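A sketch of that mock-aggregator idea, reusing the toy aggregator interface from the Aggregator section rather than the real WSALoggingLib classes: the mock just records what would have been logged so the test can assert on it instead of writing to Hive.

```cpp
#include <cstdint>
#include <vector>

struct ToyRequestRecord {
  int64_t opCount = 1;
};

class ToyRecordAggregator {
 public:
  virtual ~ToyRecordAggregator() = default;
  virtual void logAggregated(const std::vector<ToyRequestRecord>& records) = 0;
};

// Mock used in tests: capture the aggregated records instead of logging them.
class MockRecordAggregator : public ToyRecordAggregator {
 public:
  void logAggregated(const std::vector<ToyRequestRecord>& records) override {
    logged_.insert(logged_.end(), records.begin(), records.end());
  }

  const std::vector<ToyRequestRecord>& logged() const { return logged_; }

 private:
  std::vector<ToyRequestRecord> logged_;
};
```

A unit test would wire a mock like this into the request log, push a few events through the tracker, force a flush, and then assert on what was captured.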
Canary Test
To test your code against production data, you need to set up a canary testing environment, build your package, and deploy it to the canary machine. The following steps/examples are for WarmStorage; other applications may differ.
- Ask a WarmStorage engineer (or @haowux) to set up a canary test cluster. You will get a cluster name, e.g. ws.sandbox.denny07
- Build the package. Make sure you have cloned the configerator repo, then run
fbpkg build -E warm_storage.storage_service --expire 28d
When this finishes successfully, you will see a completed progress bar (============) and, under it, your package name warm_storage.storage_service:xxxxxx.
You will be prompted for permission the first time; request it for two months, then run the command again and it will succeed. You should see the link to request permission in the error message, but if not, use this link.
Run buck clean before fbpkg to avoid running out of space on your dev server.
- Canary. We are canarying to denny07: link. Choose one task/host from the cluster; let's say we are using task 1. Find the host name from the above link, e.g. warmstorage114.04.cln2
- Canary the configerator config. Make your configerator change on the sampling config by adding sampling rate 1 to cluster denny07 (rate 1 means choosing everything), e.g. D25934246. Then run:
arc build
hg commit
arc canary --hosts warmstorage114.04.cln2
- Now canary to ws:
sf canary --tw-job priv2_cln/warm_storage/ws.sandbox.denny07.storage_service --tasks 1 --duration 1d --sfid warm_storage/storage_service --num-control-tasks 0 -V warm_storage.storage_service:xxxxxx --canary-task-extra-args='--sfn_cachelib_enable_ml_admission --sfn_cachelib_store_features --sfn_cachelib_ml_admission_metadata_override="conveyor_ash8strontium_20201111" --sfn_cachelib_ml_admission_target_recall_override=0.7'
You will be prompted for permission the first time; request it for two months, then run the command again and it will succeed. You should see the link to request permission in the error message, but if not, use this link.
What configs to create
There are two configs in configerator that need to be supplied:
- Sampling config definition: D21534544
- Adding specific sampling config for your hosts: D21695295
- This config can be picked up without restarting hosts if you set up the sampler by subscription.
- You may even define a host level config to test out your sampling by canarying this configerator change to one single host.