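Qlib Recorder: Experiment Management
Introduction
Qlib contains an experiment management system named QlibRecorder, which is designed to help users handle experiments and analyse results efficiently.
The system has three components:
- ExperimentManager
A class that manages experiments.
- Experiment
A class of experiment; each instance is responsible for a single experiment.
- Recorder
A class of recorder; each instance is responsible for a single run.
Here is a general view of the structure of the system: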
ExperimentManager
    - Experiment 1
        - Recorder 1
        - Recorder 2
        - ...
    - Experiment 2
        - Recorder 1
        - Recorder 2
        - ...
    - ...
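The three-level structure above can be pictured with a few toy classes. This is only a sketch of the ownership relation (manager owns experiments, experiment owns recorders); the class names ToyExpManager, ToyExperiment and ToyRecorder are hypothetical and are not part of Qlib.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyRecorder:
    """Hypothetical stand-in for a Recorder: holds the data of one run."""
    name: str
    metrics: Dict[str, float] = field(default_factory=dict)


@dataclass
class ToyExperiment:
    """Hypothetical stand-in for an Experiment: owns many recorders."""
    name: str
    recorders: Dict[str, ToyRecorder] = field(default_factory=dict)

    def create_recorder(self, name: str) -> ToyRecorder:
        rec = ToyRecorder(name)
        self.recorders[name] = rec
        return rec


@dataclass
class ToyExpManager:
    """Hypothetical stand-in for an ExperimentManager: owns many experiments."""
    experiments: Dict[str, ToyExperiment] = field(default_factory=dict)

    def create_exp(self, name: str) -> ToyExperiment:
        exp = ToyExperiment(name)
        self.experiments[name] = exp
        return exp


toy_manager = ToyExpManager()
toy_exp = toy_manager.create_exp("Experiment 1")
toy_exp.create_recorder("Recorder 1").metrics["ic"] = 0.05
toy_exp.create_recorder("Recorder 2")

# Walk the tree: manager -> experiment -> recorder names
tree = {e.name: sorted(e.recorders) for e in toy_manager.experiments.values()}
print(tree)
```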
This experiment management system defines a set of interfaces and provides a concrete implementation, MLflowExpManager, which is based on the machine learning platform MLflow.
If users set the implementation of ExpManager to MLflowExpManager, they can run the command mlflow ui to visualize and check the experiment results. For more information, please refer to the MLflow documentation.
Qlib Recorder
QlibRecorder provides a high-level API for using the experiment management system. The interfaces are wrapped in the variable R in Qlib, and users can use R directly to interact with the system. The following command shows how to import R in Python:
from qlib.workflow import R
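QlibRecorder provides several common APIs for managing experiments and recorders within a workflow. For more available APIs, please refer to the following sections about Experiment Manager, Experiment and Recorder.
Here are the available interfaces of QlibRecorder: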
- class qlib.workflow.__init__.QlibRecorder(exp_manager: ExpManager)
A global system that helps to manage the experiments.
- __init__(exp_manager: ExpManager)
- start(*, experiment_id: str | None = None, experiment_name: str | None = None, recorder_id: str | None = None, recorder_name: str | None = None, uri: str | None = None, resume: bool = False)
Method to start an experiment. This method can only be called within a Python with statement. Here is the example code:
    # start new experiment and recorder
    with R.start(experiment_name='test', recorder_name='recorder_1'):
        model.fit(dataset)
        R.log...
        ... # further operations

    # resume previous experiment and recorder
    with R.start(experiment_name='test', recorder_name='recorder_1', resume=True):
        # if users want to resume recorder, they have to specify the exact same name for experiment and recorder.
        ... # further operations
- Parameters:
experiment_id (str) -- id of the experiment one wants to start.
experiment_name (str) -- name of the experiment one wants to start.
recorder_id (str) -- id of the recorder under the experiment one wants to start.
recorder_name (str) -- name of the recorder under the experiment one wants to start.
uri (str) -- The tracking uri of the experiment, where all the artifacts/metrics etc. will be stored. The default uri is set in the qlib.config. Note that this uri argument will not change the one defined in the config file. Therefore, the next time when users call this function in the same experiment, they have to also specify this argument with the same value. Otherwise, inconsistent uri may occur.
resume (bool) -- whether to resume the specific recorder with given name under the given experiment.
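Why start is used inside a with statement can be illustrated with a toy context manager. The sketch below assumes that a run is closed as FINISHED when the block exits normally and as FAILED when it raises; the source does not spell this out, and toy_start and events are hypothetical names, not Qlib's implementation.

```python
from contextlib import contextmanager

# Hypothetical log of (action, detail) pairs used only for this illustration.
events = []


@contextmanager
def toy_start(experiment_name, recorder_name):
    """Toy analogue of a start() context manager (not Qlib's implementation)."""
    events.append(("start", f"{experiment_name}/{recorder_name}"))
    try:
        yield recorder_name
    except Exception:
        # The body raised: close the run as FAILED and re-raise.
        events.append(("end", "FAILED"))
        raise
    else:
        # The body completed normally.
        events.append(("end", "FINISHED"))


with toy_start("test", "recorder_1"):
    events.append(("log", "train_loss=0.33"))

try:
    with toy_start("test", "recorder_2"):
        raise ValueError("training diverged")
except ValueError:
    pass

print(events)
```

With the lower-level start_exp / end_exp pair, this bookkeeping is left to the caller.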
- start_exp(*, experiment_id=None, experiment_name=None, recorder_id=None, recorder_name=None, uri=None, resume=False)
Lower-level method for starting an experiment. When using this method, one should end the experiment manually, and the status of the recorder may not be handled properly. Here is the example code:
    R.start_exp(experiment_name='test', recorder_name='recorder_1')
    ... # further operations
    R.end_exp('FINISHED') or R.end_exp(Recorder.STATUS_S)
- Parameters:
experiment_id (str) -- id of the experiment one wants to start.
experiment_name (str) -- the name of the experiment to be started
recorder_id (str) -- id of the recorder under the experiment one wants to start.
recorder_name (str) -- name of the recorder under the experiment one wants to start.
uri (str) -- the tracking uri of the experiment, where all the artifacts/metrics etc. will be stored. The default uri is set in the qlib.config.
resume (bool) -- whether to resume the specific recorder with given name under the given experiment.
- Return type:
An experiment instance being started.
- end_exp(recorder_status='FINISHED')
Method for ending an experiment manually. It will end the current active experiment, as well as its active recorder, with the specified status type. Here is the example code of the method:
    R.start_exp(experiment_name='test')
    ... # further operations
    R.end_exp('FINISHED') or R.end_exp(Recorder.STATUS_S)
- Parameters:
recorder_status (str) -- The status of a recorder, which can be SCHEDULED, RUNNING, FINISHED, FAILED.
- search_records(experiment_ids, **kwargs)
Get a pandas DataFrame of records that fit the search criteria.
The arguments of this function are not fixed; they differ between implementations of ExpManager in Qlib. Qlib now provides an implementation of ExpManager with mlflow, and here is the example code of the method with the MLflowExpManager:
    R.log_metrics(m=2.50, step=0)
    records = R.search_records([experiment_id], order_by=["metrics.m DESC"])
- Parameters:
experiment_ids (list) -- list of experiment IDs.
filter_string (str) -- filter query string, defaults to searching all runs.
run_view_type (int) -- one of enum values ACTIVE_ONLY, DELETED_ONLY, or ALL (e.g. in mlflow.entities.ViewType).
max_results (int) -- the maximum number of runs to put in the dataframe.
order_by (list) -- list of columns to order by (e.g., “metrics.rmse”).
- Returns:
A pandas.DataFrame of records, where each metric, parameter, and tag is expanded into its own column named metrics.*, params.*, and tags.* respectively. For records that don't have a particular metric, parameter, or tag, the value will be (NumPy) NaN, None, or None respectively.
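The column layout described above can be sketched without pandas. The snippet flattens made-up run records into rows with metrics.*, params.* and tags.* keys and fills the gaps with NaN or None; flatten_runs and the sample data are hypothetical and only mirror the description, not the MLflow or Qlib code.

```python
import math

# Hypothetical records; each has optional metrics, params and tags.
runs = [
    {"run_id": "a", "metrics": {"m": 2.5}, "params": {"lr": "0.01"}, "tags": {}},
    {"run_id": "b", "metrics": {}, "params": {}, "tags": {"stage": "dev"}},
]


def flatten_runs(runs):
    """Expand nested metrics/params/tags into prefixed columns."""
    columns = {}
    for run in runs:
        for group in ("metrics", "params", "tags"):
            for key in run[group]:
                columns[f"{group}.{key}"] = group
    rows = []
    for run in runs:
        row = {"run_id": run["run_id"]}
        for col, group in columns.items():
            key = col.split(".", 1)[1]
            # Missing metrics become NaN; missing params and tags become None.
            default = math.nan if group == "metrics" else None
            row[col] = run[group].get(key, default)
        rows.append(row)
    return rows


rows = flatten_runs(runs)
for row in rows:
    print(row)
```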
- list_experiments()
Method for listing all the existing experiments (except for those that have been deleted).
    exps = R.list_experiments()
- Return type:
A dictionary (name -> experiment) of the stored experiment information.
- list_recorders(experiment_id=None, experiment_name=None)
Method for listing all the recorders of the experiment with the given id or name.
If the user doesn't provide the id or name of the experiment, this method will try to retrieve the default experiment and list all of its recorders. If the default experiment doesn't exist, the method will first create the default experiment, and then create a new recorder under it. (More information about the default experiment can be found in the Experiment section.)
Here is the example code:
    recorders = R.list_recorders(experiment_name='test')
- Parameters:
experiment_id (str) -- id of the experiment.
experiment_name (str) -- name of the experiment.
- Return type:
A dictionary (id -> recorder) of the stored recorder information.
- get_exp(*, experiment_id=None, experiment_name=None, create: bool = True, start: bool = False) -> Experiment
Method for retrieving an experiment with the given id or name. When the create argument is set to True and no valid experiment is found, this method will create one for you. Otherwise, it will only retrieve a specific experiment or raise an Error.
If 'create' is True:
    If an active experiment exists:
        no id or name specified, return the active experiment.
        if id or name is specified, return the specified experiment. If no such exp is found, create a new experiment with the given id or name.
    If no active experiment exists:
        no id or name specified, create a default experiment, and the experiment is set to be active.
        if id or name is specified, return the specified experiment. If no such exp is found, create a new experiment with the given name or the default experiment.
Else if 'create' is False:
    If an active experiment exists:
        no id or name specified, return the active experiment.
        if id or name is specified, return the specified experiment. If no such exp is found, raise Error.
    If no active experiment exists:
        no id or name specified. If the default experiment exists, return it; otherwise, raise Error.
        if id or name is specified, return the specified experiment. If no such exp is found, raise Error.
Here are some use cases:
    # Case 1
    with R.start(experiment_name='test'):
        exp = R.get_exp()
        recorders = exp.list_recorders()

    # Case 2
    with R.start(experiment_name='test'):
        exp = R.get_exp(experiment_name='test1')

    # Case 3
    exp = R.get_exp()  # -> a default experiment.

    # Case 4
    exp = R.get_exp(experiment_name='test')

    # Case 5
    exp = R.get_exp(create=False)  # -> the default experiment if exists.
- Parameters:
experiment_id (str) -- id of the experiment.
experiment_name (str) -- name of the experiment.
create (boolean) -- an argument that determines whether the method will automatically create a new experiment according to the user's specification if the experiment hasn't been created before.
start (bool) -- when start is True, the experiment will be started (activated) if it has not started yet. It is designed for R.log_params to auto-start experiments.
- Return type:
An experiment instance with given id or name.
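The lookup rules of get_exp can be condensed into one small function. The sketch below works on experiment names only and is a hypothetical simplification (resolve_exp is not a Qlib function); the default name 'Experiment' is taken from the default-experiment description later in this document.

```python
def resolve_exp(existing, active=None, name=None, create=True, default="Experiment"):
    """Toy version of the get_exp lookup rules (names only, not Qlib's code).

    existing: set of experiment names that already exist (mutated on create).
    active:   name of the active experiment, or None.
    """
    if name is None:
        if active is not None:
            return active       # no name given: the active experiment wins
        name = default          # otherwise fall back to the default experiment
    if name in existing:
        return name             # found the requested (or default) experiment
    if not create:
        raise ValueError(f"experiment {name!r} not found")
    existing.add(name)          # create=True: create it on demand
    return name


known = {"test"}
case_active = resolve_exp(known, active="test")                # active experiment
case_named = resolve_exp(known, active="test", name="test1")   # created on demand
case_default = resolve_exp(known)                              # default experiment
print(case_active, case_named, case_default, sorted(known))
```

With create=False the same function raises instead of creating, which matches the second half of the rules.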
- delete_exp(experiment_id=None, experiment_name=None)
Method for deleting the experiment with the given id or name. At least one of id or name must be given; otherwise, an error will occur.
Here is the example code:
    R.delete_exp(experiment_name='test')
- Parameters:
experiment_id (str) -- id of the experiment.
experiment_name (str) -- name of the experiment.
- get_uri()
Method for retrieving the uri of the current experiment manager.
Here is the example code:
    uri = R.get_uri()
- Return type:
The uri of current experiment manager.
- set_uri(uri: str | None)
Method to reset the default uri of the current experiment manager.
NOTE:
When the uri refers to a file path, please use the absolute path instead of strings like "~/mlruns/". The backend doesn't support strings like this.
- uri_context(uri: str)
Temporarily set the exp_manager's default_uri to uri.
NOTE: Please refer to the NOTE in set_uri.
- Parameters:
uri (Text) -- the temporary uri
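A temporary override like uri_context is typically written as a context manager that restores the previous value on exit. The sketch below shows that pattern on a hypothetical ToyManager class (not Qlib's ExpManager), and also shows one way to turn a "~/..." string into the absolute path that the NOTE in set_uri asks for.

```python
import os
from contextlib import contextmanager


class ToyManager:
    """Hypothetical object holding a default tracking uri."""

    def __init__(self, default_uri):
        self.default_uri = default_uri

    @contextmanager
    def uri_context(self, uri):
        """Temporarily switch default_uri, restoring it even on error."""
        previous = self.default_uri
        self.default_uri = uri
        try:
            yield
        finally:
            self.default_uri = previous


mgr = ToyManager("file:/tmp/mlruns")
with mgr.uri_context("file:/tmp/other_runs"):
    inside = mgr.default_uri
after = mgr.default_uri
print(inside, after)

# Expand "~" and make the path absolute before using it as a uri.
abs_path = os.path.abspath(os.path.expanduser("~/mlruns"))
print(os.path.isabs(abs_path))
```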
- get_recorder(*, recorder_id=None, recorder_name=None, experiment_id=None, experiment_name=None) -> Recorder
Method for retrieving a recorder.
If an active recorder exists:
    no id or name specified, return the active recorder.
    if id or name is specified, return the specified recorder.
If no active recorder exists:
    no id or name specified, raise Error.
    if id or name is specified, the corresponding experiment_name must also be given; return the specified recorder. Otherwise, raise Error.
The recorder can be used for further processing such as save_objects, load_object, log_params, log_metrics, etc.
Here are some use cases:
    # Case 1
    with R.start(experiment_name='test'):
        recorder = R.get_recorder()

    # Case 2
    with R.start(experiment_name='test'):
        recorder = R.get_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d')

    # Case 3
    recorder = R.get_recorder()  # -> Error

    # Case 4
    recorder = R.get_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d')  # -> Error

    # Case 5
    recorder = R.get_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d', experiment_name='test')
Here are some questions users may have:
- Q: Which recorder is returned if multiple recorders match the query (e.g. a query with experiment_name)?
- A: If the mlflow backend is used, the recorder with the latest start_time is returned, because MLflow's search_runs function guarantees it.
- Parameters:
recorder_id (str) -- id of the recorder.
recorder_name (str) -- name of the recorder.
experiment_name (str) -- name of the experiment.
- Return type:
A recorder instance.
- delete_recorder(recorder_id=None, recorder_name=None)
Method for deleting the recorder with the given id or name. At least one of id or name must be given; otherwise, an error will occur.
Here is the example code:
    R.delete_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d')
- Parameters:
recorder_id (str) -- id of the recorder.
recorder_name (str) -- name of the recorder.
- save_objects(local_path=None, artifact_path=None, **kwargs: Dict[str, Any])
Method for saving objects as artifacts in the experiment to the uri. It supports either saving from a local file/directory, or directly saving objects. Users can use valid Python keyword arguments to specify the object to be saved as well as its name (name: value).
In summary, this API is designed for saving objects to the experiment management backend path:
1. Qlib provides two methods to specify objects:
    - Passing in the object directly with **kwargs (e.g. R.save_objects(trained_model=model))
    - Passing in the local path to the object, i.e. the local_path parameter.
2. artifact_path represents the experiment management backend path.
If an active recorder exists: it will save the objects through the active recorder.
If no active recorder exists: the system will create a default experiment and a new recorder, and save the objects under it.
Note
If one wants to save objects with a specific recorder, it is recommended to first get the specific recorder through the get_recorder API and use that recorder to save the objects. The supported arguments are the same as for this method.
Here are some use cases:
    # Case 1
    with R.start(experiment_name='test'):
        pred = model.predict(dataset)
        R.save_objects(**{"pred.pkl": pred}, artifact_path='prediction')
        rid = R.get_recorder().id
    ...
    R.get_recorder(recorder_id=rid).load_object("prediction/pred.pkl")  # after saving objects, you can load the previous object with this api

    # Case 2
    with R.start(experiment_name='test'):
        R.save_objects(local_path='results/pred.pkl', artifact_path="prediction")
        rid = R.get_recorder().id
    ...
    R.get_recorder(recorder_id=rid).load_object("prediction/pred.pkl")  # after saving objects, you can load the previous object with this api
- Parameters:
local_path (str) -- if provided, then save the file or directory to the artifact URI.
artifact_path (str) -- the relative path for the artifact to be stored in the URI.
**kwargs (Dict[Text, Any]) -- the object to be saved. For example, {"pred.pkl": pred}
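The keyword-argument form of save_objects, where the key is the file name and the value is the object, can be mimicked with pickle and a temporary directory. This is a minimal sketch under the assumption that objects are pickled one file per name; toy_save_objects and toy_load_object are hypothetical helpers, not Qlib functions.

```python
import pickle
import tempfile
from pathlib import Path


def toy_save_objects(root, artifact_path=None, **objects):
    """Pickle each keyword argument to root/artifact_path/<name>."""
    target = Path(root) / (artifact_path or "")
    target.mkdir(parents=True, exist_ok=True)
    for name, obj in objects.items():
        with open(target / name, "wb") as f:
            pickle.dump(obj, f)


def toy_load_object(root, name):
    """Load one pickled object by its path relative to root."""
    with open(Path(root) / name, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as root:
    pred = {"SH600000": 0.12, "SH600004": -0.03}  # made-up prediction scores
    # A name such as "pred.pkl" is not a valid identifier, hence the ** unpacking.
    toy_save_objects(root, artifact_path="prediction", **{"pred.pkl": pred})
    loaded = toy_load_object(root, "prediction/pred.pkl")

print(loaded == pred)
```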
- load_object(name: str)
Method for loading an object from artifacts in the experiment in the uri.
- log_params(**kwargs)
Method for logging parameters during an experiment. In addition to using R, one can also log to a specific recorder after getting it with the get_recorder API.
If an active recorder exists: it will log parameters through the active recorder.
If no active recorder exists: the system will create a default experiment as well as a new recorder, and log parameters under it.
Here are some use cases:
    # Case 1
    with R.start(experiment_name='test'):
        R.log_params(learning_rate=0.01)

    # Case 2
    R.log_params(learning_rate=0.01)
- Parameters:
argument (keyword) -- name1=value1, name2=value2, ...
- log_metrics(step=None, **kwargs)
Method for logging metrics during an experiment. In addition to using R, one can also log to a specific recorder after getting it with the get_recorder API.
If an active recorder exists: it will log metrics through the active recorder.
If no active recorder exists: the system will create a default experiment as well as a new recorder, and log metrics under it.
Here are some use cases:
    # Case 1
    with R.start(experiment_name='test'):
        R.log_metrics(train_loss=0.33, step=1)

    # Case 2
    R.log_metrics(train_loss=0.33, step=1)
- Parameters:
argument (keyword) -- name1=value1, name2=value2, ...
- log_artifact(local_path: str, artifact_path: str | None = None)
Log a local file or directory as an artifact of the currently active run.
If an active recorder exists: it will log the artifact through the active recorder.
If no active recorder exists: the system will create a default experiment as well as a new recorder, and log the artifact under it.
- Parameters:
local_path (str) -- Path to the file to write.
artifact_path (Optional[str]) -- If provided, the directory in artifact_uri to write to.
- download_artifact(path: str, dst_path: str | None = None) -> str
Download an artifact file or directory from a run to a local directory if applicable, and return a local path for it.
- Parameters:
path (str) -- Relative source path to the desired artifact.
dst_path (Optional[str]) -- Absolute path of the local filesystem destination directory to which to download the specified artifacts. This directory must already exist. If unspecified, the artifacts will be downloaded to a new uniquely-named directory on the local filesystem.
- Returns:
Local path of desired artifact.
- Return type:
str
- set_tags(**kwargs)
Method for setting tags for a recorder. In addition to using R, one can also set the tag on a specific recorder after getting it with the get_recorder API.
If an active recorder exists: it will set tags through the active recorder.
If no active recorder exists: the system will create a default experiment as well as a new recorder, and set the tags under it.
Here are some use cases:
    # Case 1
    with R.start(experiment_name='test'):
        R.set_tags(release_version="2.2.0")

    # Case 2
    R.set_tags(release_version="2.2.0")
- Parameters:
argument (keyword) -- name1=value1, name2=value2, ...
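Experiment Manager
The ExpManager module in Qlib is responsible for managing different experiments. Most of the APIs of ExpManager are similar to those of QlibRecorder, and the most important API is the get_exp method. Users can refer to the documentation above for details on how to use the get_exp method.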
- class qlib.workflow.expm.ExpManager(uri: str, default_exp_name: str | None)
This is the ExpManager class for managing experiments. The API is designed similar to mlflow. (The link: https://mlflow.org/docs/latest/python_api/mlflow.html)
The ExpManager is expected to be a singleton (by the way, we can have multiple Experiments with different uri; users can get different experiments from different uri, and then compare their records). Global Config (i.e. C) is also a singleton.
So we try to align them together. They share the same variable, which is called default uri. Please refer to ExpManager.default_uri for details of variable sharing.
When the user starts an experiment, the user may want to set the uri to a specific uri (it will override default uri during this period), and then unset the specific uri and fallback to the default uri. ExpManager._active_exp_uri is that specific uri.
- __init__(uri: str, default_exp_name: str | None)
- start_exp(*, experiment_id: str | None = None, experiment_name: str | None = None, recorder_id: str | None = None, recorder_name: str | None = None, uri: str | None = None, resume: bool = False, **kwargs) -> Experiment
Start an experiment. This method first gets or creates an experiment, and then sets it to be active.
Maintaining _active_exp_uri is included in start_exp; the remaining implementation should be included in _start_exp in the subclass.
- Parameters:
experiment_id (str) -- id of the active experiment.
experiment_name (str) -- name of the active experiment.
recorder_id (str) -- id of the recorder to be started.
recorder_name (str) -- name of the recorder to be started.
uri (str) -- the current tracking URI.
resume (boolean) -- whether to resume the experiment and recorder.
- Return type:
An active experiment.
- end_exp(recorder_status: str = 'SCHEDULED', **kwargs)
End an active experiment.
Maintaining _active_exp_uri is included in end_exp, remaining implementation should be included in _end_exp in subclass
- Parameters:
experiment_name (str) -- name of the active experiment.
recorder_status (str) -- the status of the active recorder of the experiment.
- create_exp(experiment_name: str | None = None)
Create an experiment.
- Parameters:
experiment_name (str) -- the experiment name, which must be unique.
- Return type:
An experiment object.
- Raises:
ExpAlreadyExistError --
- search_records(experiment_ids=None, **kwargs)
Get a pandas DataFrame of records that fit the search criteria of the experiment. Inputs are the search criteria the user wants to apply.
- Returns:
A pandas.DataFrame of records, where each metric, parameter, and tag is expanded into its own column named metrics.*, params.*, and tags.* respectively. For records that don't have a particular metric, parameter, or tag, the value will be (NumPy) NaN, None, or None respectively.
- get_exp(*, experiment_id=None, experiment_name=None, create: bool = True, start: bool = False)
Retrieve an experiment. This method covers both getting the active experiment and getting or creating a specific experiment.
When the user specifies an experiment id or name, the method will try to return the specific experiment. When the user does not provide an experiment id or name, the method will try to return the current active experiment. The create argument determines whether the method will automatically create a new experiment according to the user's specification if the experiment hasn't been created before.
If create is True:
    If an active experiment exists:
        no id or name specified, return the active experiment.
        if id or name is specified, return the specified experiment. If no such exp is found, create a new experiment with the given id or name. If start is set to True, the experiment is set to be active.
    If no active experiment exists:
        no id or name specified, create a default experiment.
        if id or name is specified, return the specified experiment. If no such exp is found, create a new experiment with the given id or name. If start is set to True, the experiment is set to be active.
Else if create is False:
    If an active experiment exists:
        no id or name specified, return the active experiment.
        if id or name is specified, return the specified experiment. If no such exp is found, raise Error.
    If no active experiment exists:
        no id or name specified. If the default experiment exists, return it; otherwise, raise Error.
        if id or name is specified, return the specified experiment. If no such exp is found, raise Error.
- Parameters:
experiment_id (str) -- id of the experiment to return.
experiment_name (str) -- name of the experiment to return.
create (boolean) -- create the experiment if it hasn't been created before.
start (boolean) -- start the new experiment if one is created.
- Return type:
An experiment object.
- delete_exp(experiment_id=None, experiment_name=None)
Delete an experiment.
- Parameters:
experiment_id (str) -- the experiment id.
experiment_name (str) -- the experiment name.
- property default_uri
Get the default tracking URI from qlib.config.C
- property uri
Get the default tracking URI or current URI.
- Return type:
The tracking URI string.
- list_experiments()
List all the existing experiments.
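- Return type:
A dictionary (name -> experiment) of the stored experiment information.
For other interfaces such as create_exp and delete_exp, please refer to the Experiment Manager API.
Experiment
The Experiment class is solely responsible for a single experiment and handles every operation related to it. It includes basic methods such as start and end. In addition, methods related to recorders are available, such as get_recorder and list_recorders.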
- class qlib.workflow.exp.Experiment(id, name)
This is the Experiment class for each experiment being run. The API is designed similar to mlflow. (The link: https://mlflow.org/docs/latest/python_api/mlflow.html)
- __init__(id, name)
- start(*, recorder_id=None, recorder_name=None, resume=False)
Start the experiment and set it to be active. This method will also start a new recorder.
- Parameters:
recorder_id (str) -- the id of the recorder to be created.
recorder_name (str) -- the name of the recorder to be created.
resume (bool) -- whether to resume the first recorder
- Return type:
An active recorder.
- end(recorder_status='SCHEDULED')
End the experiment.
- Parameters:
recorder_status (str) -- the status to set on the recorder when ending (SCHEDULED, RUNNING, FINISHED, FAILED).
- create_recorder(recorder_name=None)
Create a recorder for each experiment.
- Parameters:
recorder_name (str) -- the name of the recorder to be created.
- Return type:
A recorder object.
- search_records(**kwargs)
Get a pandas DataFrame of records that fit the search criteria of the experiment. Inputs are the search criteria the user wants to apply.
- Returns:
A pandas.DataFrame of records, where each metric, parameter, and tag is expanded into its own column named metrics.*, params.*, and tags.* respectively. For records that don't have a particular metric, parameter, or tag, the value will be (NumPy) NaN, None, or None respectively.
- delete_recorder(recorder_id)
Delete a recorder of the experiment.
- Parameters:
recorder_id (str) -- the id of the recorder to be deleted.
- get_recorder(recorder_id=None, recorder_name=None, create: bool = True, start: bool = False) -> Recorder
Retrieve a Recorder for the user. When the user specifies a recorder id or name, the method will try to return the specific recorder. When the user does not provide a recorder id or name, the method will try to return the current active recorder. The create argument determines whether the method will automatically create a new recorder according to the user's specification if the recorder hasn't been created before.
If create is True:
    If an active recorder exists:
        no id or name specified, return the active recorder.
        if id or name is specified, return the specified recorder. If no such recorder is found, create a new recorder with the given id or name. If start is set to True, the recorder is set to be active.
    If no active recorder exists:
        no id or name specified, create a new recorder.
        if id or name is specified, return the specified recorder. If no such recorder is found, create a new recorder with the given id or name. If start is set to True, the recorder is set to be active.
Else if create is False:
    If an active recorder exists:
        no id or name specified, return the active recorder.
        if id or name is specified, return the specified recorder. If no such recorder is found, raise Error.
    If no active recorder exists:
        no id or name specified, raise Error.
        if id or name is specified, return the specified recorder. If no such recorder is found, raise Error.
- Parameters:
recorder_id (str) -- the id of the recorder to be retrieved.
recorder_name (str) -- the name of the recorder to be retrieved.
create (boolean) -- create the recorder if it hasn't been created before.
start (boolean) -- start the new recorder if one is created.
- Return type:
A recorder object.
- list_recorders(rtype: Literal['dict', 'list'] = 'dict', **flt_kwargs) -> List[Recorder] | Dict[str, Recorder]
List all the existing recorders of this experiment. Please first get the experiment instance before calling this method. If users want to use the method R.list_recorders(), please refer to the related API document in QlibRecorder.
- Parameters:
flt_kwargs (dict) -- filter recorders by conditions, e.g. list_recorders(status=Recorder.STATUS_FI)
- Returns:
- if rtype == "dict":
A dictionary (id -> recorder) of the stored recorder information.
- elif rtype == "list":
A list of Recorder.
- Return type:
The return type depends on rtype
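How rtype and the filter keywords interact can be sketched as follows. The example assumes that each filter keyword is compared for equality against a recorder attribute, which the source only hints at with status=Recorder.STATUS_FI; ToyRecorder and toy_list_recorders are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ToyRecorder:
    """Hypothetical recorder with just an id and a status."""
    id: str
    status: str


def toy_list_recorders(recorders, rtype="dict", **flt_kwargs):
    """Filter by attribute equality, then return a dict or a list."""
    kept = [
        r for r in recorders
        if all(getattr(r, key) == value for key, value in flt_kwargs.items())
    ]
    if rtype == "dict":
        return {r.id: r for r in kept}
    if rtype == "list":
        return kept
    raise NotImplementedError(f"unsupported rtype: {rtype}")


recs = [ToyRecorder("r1", "FINISHED"), ToyRecorder("r2", "FAILED")]
as_dict = toy_list_recorders(recs)
finished = toy_list_recorders(recs, rtype="list", status="FINISHED")
print(sorted(as_dict), [r.id for r in finished])
```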
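For other interfaces such as search_records and delete_recorder, please refer to the Experiment API.
Qlib also provides a default Experiment, which is created and used automatically in certain situations when users call APIs such as log_metrics or get_exp. If the default Experiment is used, related log messages are emitted when running Qlib. Users can change the name of the default Experiment in the Qlib config file or during Qlib's initialization; its default value is 'Experiment'.
Recorder
The Recorder class is responsible for a single recorder. It handles the detailed operations of a single run, such as log_metrics and log_params, and is designed to help users easily track the results and content generated during a run.
Here are some important APIs that are not included in QlibRecorder: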
- class qlib.workflow.recorder.Recorder(experiment_id, name)
This is the Recorder class for logging the experiments. The API is designed similar to mlflow. (The link: https://mlflow.org/docs/latest/python_api/mlflow.html)
The status of the recorder can be SCHEDULED, RUNNING, FINISHED, FAILED.
- __init__(experiment_id, name)
- save_objects(local_path=None, artifact_path=None, **kwargs)
Save objects such as prediction files or model checkpoints to the artifact URI. Users can save objects through keyword arguments (name: value).
Please refer to the docs of qlib.workflow:R.save_objects
- Parameters:
local_path (str) -- if provided, then save the file or directory to the artifact URI.
artifact_path (str) -- the relative path for the artifact to be stored in the URI.
- load_object(name)
Load objects such as prediction file or model checkpoints.
- Parameters:
name (str) -- name of the file to be loaded.
- Return type:
The saved object.
- start_run()
Start running or resuming the Recorder. The return value can be used as a context manager within a with block; otherwise, you must call end_run() to terminate the current run. (See ActiveRun class in mlflow)
- Return type:
An active running object (e.g. mlflow.ActiveRun object).
- end_run()
End an active Recorder.
- log_params(**kwargs)
Log a batch of params for the current run.
- Parameters:
arguments (keyword) -- key, value pair to be logged as parameters.
- log_metrics(step=None, **kwargs)
Log multiple metrics for the current run.
- Parameters:
arguments (keyword) -- key, value pair to be logged as metrics.
- log_artifact(local_path: str, artifact_path: str | None = None)
Log a local file or directory as an artifact of the currently active run.
- Parameters:
local_path (str) -- Path to the file to write.
artifact_path (Optional[str]) -- If provided, the directory in artifact_uri to write to.
- set_tags(**kwargs)
Log a batch of tags for the current run.
- Parameters:
arguments (keyword) -- key, value pair to be logged as tags.
- delete_tags(*keys)
Delete some tags from a run.
- Parameters:
keys (series of str) -- the names of all the tags to be deleted.
- list_artifacts(artifact_path: str = None)
List all the artifacts of a recorder.
- Parameters:
artifact_path (str) -- the relative path for the artifact to be stored in the URI.
- Return type:
A list of the stored artifact information (name, path, etc.).
- download_artifact(path: str, dst_path: str | None = None) -> str
Download an artifact file or directory from a run to a local directory if applicable, and return a local path for it.
- Parameters:
path (str) -- Relative source path to the desired artifact.
dst_path (Optional[str]) -- Absolute path of the local filesystem destination directory to which to download the specified artifacts. This directory must already exist. If unspecified, the artifacts will be downloaded to a new uniquely-named directory on the local filesystem.
- Returns:
Local path of desired artifact.
- Return type:
str
- list_metrics()
List all the metrics of a recorder.
- Return type:
A dictionary of the stored metrics.
- list_params()
List all the params of a recorder.
- Return type:
A dictionary of the stored params.
- list_tags()
List all the tags of a recorder.
- Return type:
A dictionary of the stored tags.
For other interfaces such as save_objects and load_object, please refer to the Recorder API.
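Record Template
The RecordTemp class generates experiment results, such as IC and backtest, in a specific format. We provide three different Record Template classes:
SignalRecord: This class generates the prediction results of the model.
SigAnaRecord: This class generates the IC, ICIR, Rank IC and Rank ICIR of the model.
Here is a simple example of what SigAnaRecord does, which users can refer to if they want to calculate IC, Rank IC and long-short return with their own prediction and label.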
from qlib.contrib.eva.alpha import calc_ic, calc_long_short_return
ic, ric = calc_ic(pred.iloc[:, 0], label.iloc[:, 0])
long_short_r, long_avg_r = calc_long_short_return(pred.iloc[:, 0], label.iloc[:, 0])
PortAnaRecord: This class generates the results of backtest. For detailed information about backtest and the available strategies, users can refer to Strategy and Backtest.
Here is a simple example of what PortAnaRecord does, which users can refer to if they want to run a backtest based on their own prediction and label.
import pandas as pd
from qlib.contrib.strategy.strategy import TopkDropoutStrategy
from qlib.contrib.evaluate import (
    backtest as normal_backtest,
    risk_analysis,
)

# BENCHMARK and pred_score are assumed to be defined by the user beforehand
# backtest
STRATEGY_CONFIG = {
    "topk": 50,
    "n_drop": 5,
}
BACKTEST_CONFIG = {
    "limit_threshold": 0.095,
    "account": 100000000,
    "benchmark": BENCHMARK,
    "deal_price": "close",
    "open_cost": 0.0005,
    "close_cost": 0.0015,
    "min_cost": 5,
}
strategy = TopkDropoutStrategy(**STRATEGY_CONFIG)
report_normal, positions_normal = normal_backtest(pred_score, strategy=strategy, **BACKTEST_CONFIG)

# analysis
analysis = dict()
analysis["excess_return_without_cost"] = risk_analysis(report_normal["return"] - report_normal["bench"])
analysis["excess_return_with_cost"] = risk_analysis(report_normal["return"] - report_normal["bench"] - report_normal["cost"])
analysis_df = pd.concat(analysis)  # type: pd.DataFrame
print(analysis_df)
For more information about the APIs, please refer to the Record Template API.
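For readers unfamiliar with the quantities that SigAnaRecord reports, the sketch below computes IC and Rank IC for a single day in plain Python. It assumes the usual definitions (IC as the Pearson correlation between prediction and label, Rank IC as the same correlation on ranks); the helper functions and the sample numbers are made up for illustration and are not how calc_ic is implemented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; assumes no ties to keep the sketch short."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


# Made-up scores and realized returns for four instruments on one day.
pred = [0.9, 0.1, 0.4, 0.7]
label = [0.03, -0.02, 0.01, 0.00]

ic = pearson(pred, label)
rank_ic = pearson(ranks(pred), ranks(label))
print(round(ic, 4), round(rank_ic, 4))
```

In practice these values are computed per date over the whole universe and then averaged, which is where ICIR-style summary statistics come from.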
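Known Limitations
Python objects are saved with pickle, which may cause problems when the environment that dumps the objects differs from the environment that loads them.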