add proxy driver

This commit is contained in:
Danil Uzlov 2025-03-26 10:42:22 +00:00
parent 9ac8ccf086
commit 05e72fc288
4 changed files with 636 additions and 0 deletions

@@ -0,0 +1,108 @@
# Proxy driver compatibility
There are 2 challenges with the proxy driver:
1. Proxy needs dynamic state. The CSI spec implies that dynamic state must be external,
which isn't ideal for small deployments, and is incompatible with democratic-csi.
2. Proxy must provide a common set of capabilities for all drivers it represents.
A great discussion of the difficulties of per-storage-class state can be found here:
- https://github.com/container-storage-interface/spec/issues/370
## Terminology and structure
"Proxy" is the driver created via `driver: proxy` in the main config.
Other drivers are referred to as "real drivers" or "underlying drivers".
"Connection" is a way to distinguish between real drivers in proxy driver calls:
- Connection name is set in storage class parameters
- Connection name is stored in volume handle
- Connection name is used as part of config file path
All config files must be mounted into the democratic-csi filesystem.
They can be added, updated and removed dynamically.
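As a hedged sketch of how these pieces tie together: only the `connection` parameter key and the `<configFolder>/<connection>.yaml` path convention come from the driver; the storage class name and provisioner string below are hypothetical.

```yaml
# Hypothetical StorageClass using connection "truenas-a".
# The proxy resolves its config from <proxy.configFolder>/truenas-a.yaml.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: proxy-truenas-a                  # hypothetical name
provisioner: org.democratic-csi.proxy    # hypothetical CSI driver name
parameters:
  connection: truenas-a                  # read by the proxy from storage class parameters
```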
## CSI features
Generally most features are supported.
However, some calls will not work:
- `ListVolumes`: the storage class context is missing
  - https://github.com/container-storage-interface/spec/issues/461
- `ListSnapshots`: TODO: can be implemented. Would require adding a snapshot secret
`NodeGetInfo` works but it brings additional challenges.
Node info is common for all storage classes.
If different drivers need different output in `NodeGetInfo`, they can't coexist.
See [node info support notes](./nodeInfo.md)
## Driver compatibility
Proxy driver has the following minimal requirements for underlying drivers:
- Node methods should not use config values
  - This restriction could be lifted
  - It exists because drivers use only the volume context (and sometimes secrets) for mounting
    - There is one exception to this rule, and arguably that driver is just broken
- Driver should not need any exotic capabilities, since capabilities are shared
- Driver should use `CreateVolume`, so that the proxy can set a proper `volume_id`
- Controller publishing is not supported, see [Controller publish support](#controller-publish-support)
Proxy advertises that it supports most CSI methods.
If a method is missing from the underlying driver,
the proxy throws an `INVALID_ARGUMENT` error.
Some methods are expected to be missing from some of the underlying drivers. In such cases the proxy returns a default value:
- `GetCapacity` returns infinite capacity when the underlying driver does not report capacity
## Volume ID format
- `volume_id` format: `v:connection-name/original-handle`
- `snapshot_id` format: `s:connection-name/original-handle`
Where:
- `v`, `s` - fixed prefix
  - Allows checking that a volume ID was created by the proxy driver
- `connection-name` - identifies connection for all CSI calls
- `original-handle` - `volume_id` handle created by the underlying driver
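The format above can be sketched as two small helper functions (mirroring `parseVolumeHandle`/`decorateVolumeHandle` in the driver source; the sample connection name and handle are made up):

```javascript
const volumeIdPrefix = "v:";
const snapshotIdPrefix = "s:";

// Wrap a real driver's handle with the connection name and a type prefix.
function decorateVolumeHandle(connectionName, handle, prefix = volumeIdPrefix) {
  return prefix + connectionName + "/" + handle;
}

// Split a proxy handle back into the connection name and the original handle.
// The original handle may itself contain '/', so only the first '/' is special.
function parseVolumeHandle(handle, prefix = volumeIdPrefix) {
  if (!handle.startsWith(prefix)) {
    throw new Error(`invalid volume handle: ${handle}: expected prefix ${prefix}`);
  }
  handle = handle.substring(prefix.length);
  return {
    connectionName: handle.substring(0, handle.indexOf("/")),
    realHandle: handle.substring(handle.indexOf("/") + 1),
  };
}
```

For example, `decorateVolumeHandle("truenas-a", "tank/pvc-123")` yields `v:truenas-a/tank/pvc-123`, and parsing that back returns `{ connectionName: "truenas-a", realHandle: "tank/pvc-123" }`.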
## Controller publish support
`ControllerPublishVolume` is not implemented because currently no driver needs it.
Implementation would need to replace `node_id` just like other methods replace `volume_id`.
See [node info support notes](./nodeInfo.md)
## Incompatible drivers
- `zfs-local-ephemeral-inline`: proxy can't set `volume_id` in `CreateVolume` to identify the underlying driver
  - are inline-ephemeral and standard drivers even compatible?
- `objectivefs`: `NodeStageVolume` uses driver parameters
  - `NodeStageVolume` needs `this.options` in `getDefaultObjectiveFSInstance`
  - Other node methods don't need driver options
  - Possible fix: add support for config values in node methods
  - Possible fix: put public pool data into volume attributes, and move private data (if any) into a secret
## Volume cloning and snapshots
Cloning works without any adjustments when both volumes use the same connection.
If the connection is different:
- TODO: same driver, same server
  - It's up to the driver to add support
  - Support is easy: just resolve the proper source location in `CreateVolume`
- TODO: same driver, different servers
  - It's up to the driver to add support
  - Example: zfs send-receive
  - Example: file copy between NFS servers
- Different drivers, block <-> file: unlikely to be practical
  - Users should probably do such things manually, by mounting both volumes into a pod
- Different drivers, same filesystem type
  - Drivers would need to implement generic export and import functions
  - For example: TrueNAS -> generic-zfs could theoretically work via zfs send
  - For example: NFS -> NFS could theoretically work via file copy
  - How would different drivers coordinate?

@@ -0,0 +1,404 @@
const _ = require("lodash");
const semver = require("semver");
const { CsiBaseDriver } = require("../index");
const yaml = require("js-yaml");
const fs = require('fs');
const { Registry } = require("../../utils/registry");
const { GrpcError, grpc } = require("../../utils/grpc");
const path = require('path');
const os = require('os'); // needed by NodeGetInfo for os.hostname()
const volumeIdPrefix = 'v:';
const snapshotIdPrefix = 's:';
const NODE_TOPOLOGY_KEY_NAME = "org.democratic-csi.topology/node";
class CsiProxyDriver extends CsiBaseDriver {
constructor(ctx, options) {
super(...arguments);
this.options.proxy.configFolder = path.normalize(this.options.proxy.configFolder);
if (this.options.proxy.configFolder.slice(-1) == '/') {
this.options.proxy.configFolder = this.options.proxy.configFolder.slice(0, -1);
}
// corresponding storage class could be deleted without notice
// let's delete entry from cache after 1 hour, so it can be cleaned by GC
// one hour seems long enough to avoid recreating frequently used drivers
// creating a new instance after long inactive period shouldn't be a problem
const oneMinuteInMs = 1000 * 60;
this.enableCacheTimeout = this.options.proxy.cacheTimeoutMinutes != -1;
this.cacheTimeout = (this.options.proxy.cacheTimeoutMinutes ?? 60) * oneMinuteInMs;
if (!this.enableCacheTimeout) {
this.ctx.logger.info("driver cache is permanent");
} else {
this.ctx.logger.info(`driver cache timeout is ${this.options.proxy.cacheTimeoutMinutes} minutes`);
}
options = options || {};
options.service = options.service || {};
options.service.identity = options.service.identity || {};
options.service.controller = options.service.controller || {};
options.service.node = options.service.node || {};
options.service.identity.capabilities =
options.service.identity.capabilities || {};
options.service.controller.capabilities =
options.service.controller.capabilities || {};
options.service.node.capabilities = options.service.node.capabilities || {};
if (!("service" in options.service.identity.capabilities)) {
this.ctx.logger.debug("setting default identity service caps");
options.service.identity.capabilities.service = [
//"UNKNOWN",
"CONTROLLER_SERVICE",
"VOLUME_ACCESSIBILITY_CONSTRAINTS",
];
}
if (!("volume_expansion" in options.service.identity.capabilities)) {
this.ctx.logger.debug("setting default identity volume_expansion caps");
options.service.identity.capabilities.volume_expansion = [
//"UNKNOWN",
"ONLINE",
//"OFFLINE"
];
}
if (!("rpc" in options.service.controller.capabilities)) {
this.ctx.logger.debug("setting default controller caps");
options.service.controller.capabilities.rpc = [
//"UNKNOWN",
"CREATE_DELETE_VOLUME",
//"PUBLISH_UNPUBLISH_VOLUME",
//"LIST_VOLUMES_PUBLISHED_NODES",
// "LIST_VOLUMES",
"GET_CAPACITY",
"CREATE_DELETE_SNAPSHOT",
// "LIST_SNAPSHOTS",
"CLONE_VOLUME",
//"PUBLISH_READONLY",
"EXPAND_VOLUME",
];
if (semver.satisfies(this.ctx.csiVersion, ">=1.3.0")) {
options.service.controller.capabilities.rpc.push(
//"VOLUME_CONDITION",
// "GET_VOLUME"
);
}
if (semver.satisfies(this.ctx.csiVersion, ">=1.5.0")) {
options.service.controller.capabilities.rpc.push(
"SINGLE_NODE_MULTI_WRITER"
);
}
}
if (!("rpc" in options.service.node.capabilities)) {
this.ctx.logger.debug("setting default node caps");
options.service.node.capabilities.rpc = [
//"UNKNOWN",
"STAGE_UNSTAGE_VOLUME",
"GET_VOLUME_STATS",
"EXPAND_VOLUME",
//"VOLUME_CONDITION",
];
if (semver.satisfies(this.ctx.csiVersion, ">=1.3.0")) {
//options.service.node.capabilities.rpc.push("VOLUME_CONDITION");
}
if (semver.satisfies(this.ctx.csiVersion, ">=1.5.0")) {
options.service.node.capabilities.rpc.push("SINGLE_NODE_MULTI_WRITER");
/**
* This is for volumes that support a mount time gid such as smb or fat
*/
//options.service.node.capabilities.rpc.push("VOLUME_MOUNT_GROUP"); // in k8s is sent in as the security context fsgroup
}
}
}
parseVolumeHandle(handle, prefix = volumeIdPrefix) {
if (!handle.startsWith(prefix)) {
throw new GrpcError(
grpc.status.INVALID_ARGUMENT,
`invalid volume handle: ${handle}: expected prefix ${prefix}`
);
}
handle = handle.substring(prefix.length);
return {
connectionName: handle.substring(0, handle.indexOf('/')),
realHandle: handle.substring(handle.indexOf('/') + 1),
};
}
decorateVolumeHandle(connectionName, handle, prefix = volumeIdPrefix) {
return prefix + connectionName + '/' + handle;
}
// returns real driver object
// internally drivers are cached and deleted on timeout
lookUpConnection(connectionName) {
const configFolder = this.options.proxy.configFolder;
const configPath = configFolder + '/' + connectionName + '.yaml';
if (this.cacheTimeout == 0) {
// when timeout is 0, force creating a new driver on each request
return this.createDriverFromFile(configPath);
}
const driverPlaceholder = {
connectionName: connectionName,
fileTime: 0,
driver: null,
timer: null,
};
const cachedDriver = this.ctx.registry.get(`controller:driver/connection=${connectionName}`, driverPlaceholder);
if (cachedDriver.timer !== null) {
clearTimeout(cachedDriver.timer);
cachedDriver.timer = null;
}
if (this.enableCacheTimeout) {
cachedDriver.timer = setTimeout(() => {
this.ctx.logger.info("removing inactive connection: %s", connectionName);
this.ctx.registry.delete(`controller:driver/connection=${connectionName}`);
cachedDriver.timer = null;
}, this.cacheTimeout);
}
const fileTime = this.getFileTime(configPath);
if (cachedDriver.fileTime != fileTime) {
this.ctx.logger.debug("connection version is old: file time %d != %d", cachedDriver.fileTime, fileTime);
cachedDriver.fileTime = fileTime;
this.ctx.logger.info("creating a new connection: %s", connectionName);
cachedDriver.driver = this.createDriverFromFile(configPath);
}
return cachedDriver.driver;
}
getFileTime(path) {
try {
const configFileStats = fs.statSync(path);
this.ctx.logger.debug("file time for '%s' is: %d", path, configFileStats.mtime.getTime());
return configFileStats.mtime.getTime();
} catch (e) {
this.ctx.logger.error("fs.statSync failed: %s", e.toString());
throw e;
}
}
createDriverFromFile(configPath) {
const fileOptions = this.createOptionsFromFile(configPath);
const mergedOptions = structuredClone(this.options);
_.merge(mergedOptions, fileOptions);
return this.createRealDriver(mergedOptions);
}
createOptionsFromFile(configPath) {
this.ctx.logger.debug("loading config: %s", configPath);
try {
return yaml.load(fs.readFileSync(configPath, "utf8"));
} catch (e) {
this.ctx.logger.error("failed parsing config file: %s", e.toString());
throw e;
}
}
validateDriverType(driver) {
const unsupportedDrivers = [
"zfs-local-",
"local-hostpath",
"objectivefs",
"proxy",
];
for (const prefix of unsupportedDrivers) {
if (driver.startsWith(prefix)) {
throw new GrpcError(
grpc.status.INVALID_ARGUMENT,
`proxy is not supported for driver: ${driver}`
);
}
}
}
createRealDriver(options) {
this.validateDriverType(options.driver);
const realContext = Object.assign({}, this.ctx);
realContext.registry = new Registry();
const realDriver = this.ctx.factory(realContext, options);
if (realDriver.constructor.name == this.constructor.name) {
throw new GrpcError(
grpc.status.INVALID_ARGUMENT,
`cyclic dependency: proxy on proxy`
);
}
this.ctx.logger.debug("using driver %s", realDriver.constructor.name);
return realDriver;
}
async checkAndRun(driver, methodName, call, defaultValue) {
if(typeof driver[methodName] !== 'function') {
if (defaultValue) return defaultValue;
// UNIMPLEMENTED could possibly confuse CSI CO into thinking
// that driver does not support methodName at all.
// INVALID_ARGUMENT should allow CO to use methodName with other storage classes.
throw new GrpcError(
grpc.status.INVALID_ARGUMENT,
`underlying driver does not support ` + methodName
);
}
return await driver[methodName](call);
}
async controllerRunWrapper(methodName, call, defaultValue) {
const volumeHandle = this.parseVolumeHandle(call.request.volume_id);
const driver = this.lookUpConnection(volumeHandle.connectionName);
call.request.volume_id = volumeHandle.realHandle;
return await this.checkAndRun(driver, methodName, call, defaultValue);
}
// ===========================================
// Controller methods below
// ===========================================
async GetCapacity(call) {
const parameters = call.request.parameters;
if (!parameters.connection) {
throw new GrpcError(
grpc.status.INVALID_ARGUMENT,
`connection missing from parameters`
);
}
const connectionName = parameters.connection;
const driver = this.lookUpConnection(connectionName);
return await this.checkAndRun(driver, 'GetCapacity', call, {
available_capacity: Number.MAX_SAFE_INTEGER,
});
}
async CreateVolume(call) {
const parameters = call.request.parameters;
if (!parameters.connection) {
throw new GrpcError(
grpc.status.INVALID_ARGUMENT,
`connection missing from parameters`
);
}
const connectionName = parameters.connection;
const driver = this.lookUpConnection(connectionName);
switch (call.request.volume_content_source?.type) {
case "snapshot": {
const snapshotHandle = this.parseVolumeHandle(call.request.volume_content_source.snapshot.snapshot_id, snapshotIdPrefix);
if (snapshotHandle.connectionName != connectionName) {
throw new GrpcError(
grpc.status.INVALID_ARGUMENT,
`can not inflate snapshot from a different connection`
);
}
call.request.volume_content_source.snapshot.snapshot_id = snapshotHandle.realHandle;
break;
}
case "volume": {
const volumeHandle = this.parseVolumeHandle(call.request.volume_content_source.volume.volume_id);
if (volumeHandle.connectionName != connectionName) {
throw new GrpcError(
grpc.status.INVALID_ARGUMENT,
`can not clone volume from a different connection`
);
}
call.request.volume_content_source.volume.volume_id = volumeHandle.realHandle;
break;
}
case undefined:
case null:
break;
default:
throw new GrpcError(
grpc.status.INVALID_ARGUMENT,
`unknown volume_content_source type: ${call.request.volume_content_source.type}`
);
}
const result = await this.checkAndRun(driver, 'CreateVolume', call);
this.ctx.logger.debug("CreateVolume result: %s", JSON.stringify(result));
result.volume.volume_id = this.decorateVolumeHandle(connectionName, result.volume.volume_id);
return result;
}
async DeleteVolume(call) {
return await this.controllerRunWrapper('DeleteVolume', call);
}
async ControllerGetVolume(call) {
return await this.controllerRunWrapper('ControllerGetVolume', call);
}
async ControllerExpandVolume(call) {
return await this.controllerRunWrapper('ControllerExpandVolume', call);
}
async CreateSnapshot(call) {
const volumeHandle = this.parseVolumeHandle(call.request.source_volume_id);
const driver = this.lookUpConnection(volumeHandle.connectionName);
call.request.source_volume_id = volumeHandle.realHandle;
const result = await this.checkAndRun(driver, 'CreateSnapshot', call);
result.snapshot.source_volume_id = this.decorateVolumeHandle(volumeHandle.connectionName, result.snapshot.source_volume_id);
result.snapshot.snapshot_id = this.decorateVolumeHandle(volumeHandle.connectionName, result.snapshot.snapshot_id, snapshotIdPrefix);
return result;
}
async DeleteSnapshot(call) {
const volumeHandle = this.parseVolumeHandle(call.request.snapshot_id, snapshotIdPrefix);
const driver = this.lookUpConnection(volumeHandle.connectionName);
call.request.snapshot_id = volumeHandle.realHandle;
return await this.checkAndRun(driver, 'DeleteSnapshot', call);
}
async ValidateVolumeCapabilities(call) {
return await this.controllerRunWrapper('ValidateVolumeCapabilities', call);
}
// ===========================================
// Node methods below
// ===========================================
//
// Theoretically, controller setup with config files could be replicated in node deployment,
// and node could create proper drivers for each call.
// But it doesn't seem like node would benefit from this.
// - CsiBaseDriver.NodeStageVolume calls this.assertCapabilities which should be run in the real driver
// but no driver-specific functions or options are used.
// So we can just create an empty driver with default options
// - Other Node* methods don't use anything driver specific
lookUpNodeDriver(call) {
const driverType = call.request.volume_context.provisioner_driver;
return this.ctx.registry.get(`node:driver/${driverType}`, () => {
const driverOptions = structuredClone(this.options);
driverOptions.driver = driverType;
return this.createRealDriver(driverOptions);
});
}
async NodeStageVolume(call) {
const driver = this.lookUpNodeDriver(call);
return await this.checkAndRun(driver, 'NodeStageVolume', call);
}
async NodeGetInfo(call) {
const nodeName = process.env.CSI_NODE_ID || os.hostname();
const result = {
node_id: nodeName,
max_volumes_per_node: 0,
};
result.accessible_topology = {
segments: {
[NODE_TOPOLOGY_KEY_NAME]: nodeName,
},
};
return result;
}
}
module.exports.CsiProxyDriver = CsiProxyDriver;

@@ -0,0 +1,120 @@
# Node info
Node info is common for all storage classes.
Proxy driver must report some values that are compatible with all real drivers.
There are 2 important values:
- topology
- node ID
There are only 2 types of topology in democratic-csi:
topology without constraints and node-local volumes.
This is easy to account for with proxy settings.
Node ID is a bit harder to solve, but this page suggests a solution.
Also, currently no real driver actually needs `node_id` to work,
so all of this is mostly a proof-of-concept:
a demonstration that a functional proxy driver can be created even with the current CSI spec.
We can replace `node_id` with a fixed value, just like we do with the `volume_id` field,
before calling the actual real driver method.
Node ID docs are not part of the user documentation because currently this is very theoretical.
The current implementation works fine but doesn't do anything useful for users.
# Node info: config example
```yaml
# configured in root proxy config
proxy:
  nodeId:
    parts:
      # when value is true, corresponding node info is included into node_id,
      # and can be accessed by proxy driver in controller
      # it allows you to cut info from node_id to make it shorter
      nodeName: true
      hostname: false
      iqn: false
      nqn: false
    # prefix allows you to save shorter values into node_id, so it can fit more than one value
    # on node prefix is replaced with short name, on controller the reverse [can] happen
    nqnPrefix:
      - shortName: '1'
        prefix: 'nqn.2000-01.com.example.nvmeof:'
      - shortName: '2'
        prefix: 'nqn.2014-08.org.nvmexpress:uuid:'
    iqnPrefix:
      - shortName: '1'
        prefix: 'iqn.2000-01.com.example:'
  nodeTopology:
    # 'cluster': all nodes have the same value
    # 'node': each node will get its own topology group
    type: cluster
```
```yaml
# add to each _real_ driver config
proxy:
  perDriver:
    # allowed values: nodeName, hostname, iqn, nqn
    # proxy will use this to decide how to fill node_id for current driver
    nodeIdType: nodeName
```
# Reasoning why such complex node_id is required
`node_name + iqn + nqn` can be very long.
Each of these values can theoretically exceed 200 characters in length.
It's unreasonable to expect users to always use short values.
But it's reasonable to expect that IQNs and NQNs in the cluster will have only a few patterns.
Many clusters likely only use one pattern with only a short variable suffix.
Even if not all nodes follow the same pattern, the amount of patterns is limited.
Saving only the short suffix allows all identifiers to fit into `node_id` without dynamic state.
Values example:
- node name: `node-name.cluster-name.customer-name.suffix`
- iqn: `iqn.2000-01.com.example:qwerty1234`
- nqn: `nqn.2014-08.org.nvmexpress:uuid:68f1d462-633b-4085-a634-899b88e5af74`
- node_id: `n=node-name.cluster-name.customer-name.suffix/i1=qwerty1234/v2=68f1d462-633b-4085-a634-899b88e5af74`
  - Note: even with a fairly long node name and default Debian IQN and NQN values, this still comfortably fits into the `node_id` length limit of 256 characters.
  - Maybe a prefix and suffix mechanism could be added for node names if very long node names turn out to be an issue in real production clusters.
    I'm not too familiar with managed k8s node name practices.
For example, if a driver needs an IQN, the proxy will find the field in the node ID starting with `i`,
search `proxy.nodeId.iqnPrefix` for the entry with `shortName = 1`, and then set `node_id` to
`proxy.nodeId.iqnPrefix[name=1].prefix` + `qwerty1234`
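A minimal sketch of this shortening and lookup, assuming the `i<shortName>=<suffix>` field format from the `node_id` example above (the helper names and the bare `i=` fallback tag are hypothetical):

```javascript
// Prefix table as it would appear under proxy.nodeId.iqnPrefix.
const iqnPrefixes = [
  { shortName: "1", prefix: "iqn.2000-01.com.example:" },
];

// On the node: replace a known prefix with its short name to save space.
function shortenIqn(iqn) {
  for (const entry of iqnPrefixes) {
    if (iqn.startsWith(entry.prefix)) {
      return `i${entry.shortName}=` + iqn.substring(entry.prefix.length);
    }
  }
  return "i=" + iqn; // no known prefix: store the full value
}

// On the controller: reverse the substitution using the same table.
function expandIqn(field) {
  const [tag, value] = field.split("=");
  const entry = iqnPrefixes.find((e) => "i" + e.shortName === tag);
  return entry ? entry.prefix + value : value;
}
```

With the table above, `shortenIqn("iqn.2000-01.com.example:qwerty1234")` yields `i1=qwerty1234`, and `expandIqn` recovers the original IQN.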
## Alternatives to prefixes
Each driver can override `node_id` based on node name.
Each driver can use template for `node_id` based on node name and/or hostname.
Config example:
```yaml
# add to each _real_ driver config
proxy:
perDriver:
# local means that this driver uses node ID template instead of using values from NodeGetInfo
# Individual nodes can use nodeIdMap instead of template.
# Possibly, even all nodes could use nodeIdMap.
nodeIdType: local
nodeIdMap:
- nodeName: node1
value: nqn.2000-01.com.example:qwerty
- nodeName: node2
value: nqn.2000-01.com.example:node2
nodeIdTemplate: iqn.2000-01.com.example:{{ hostname }}:{{ nodeName }}-suffix
```
The obvious disadvantage is that it requires a lot more configuration from the user.
Still, if this were to be useful for some reason, this is fully compatible with the current `node_id` format in proxy.
Theoretically, more info can be extracted from node to be used in `nodeIdTemplate`,
provided the info is short enough to fit into `node_id` length limit.

@@ -15,9 +15,13 @@ const { ControllerLustreClientDriver } = require("./controller-lustre-client");
const { ControllerObjectiveFSDriver } = require("./controller-objectivefs");
const { ControllerSynologyDriver } = require("./controller-synology");
const { NodeManualDriver } = require("./node-manual");
const { CsiProxyDriver } = require("./controller-proxy");
function factory(ctx, options) {
ctx.factory = factory;
switch (options.driver) {
case "proxy":
return new CsiProxyDriver(ctx, options);
case "freenas-nfs":
case "freenas-smb":
case "freenas-iscsi":