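A teammate recently ran into a problem. When configuring a Serving service through a Workload, they used the model_config_file option to specify multiple models. The config file looked roughly like this.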
➜ tmp cat model.config
model_config_list {
  config {
    name:'10062'
    base_path:'s3://xxx-ai/humanoid/10062'
    model_platform:'tensorflow'
  }
  config {
    name:'10075'
    base_path:'s3://xxx-ai/humanoid/10075'
    model_platform:'tensorflow'
  }
}
However, the Serving process failed at startup with the error Could not find base path xxxxxx. In other words, the base path could not be found.
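The cause is visible in the config file: the base_path values do not end with a slash (/). In S3, a path without a trailing / is treated as an object rather than a directory. The Serving source code that reads models from the base path is shown below. As the source shows, Serving first checks that the base path exists and then lists the files under it. For an S3 path without the trailing slash, no object with that name exists, so the existence check fails and the error is returned. The fix is simply to append a slash to the base_path value, for example s3://xxx-ai/humanoid/10075/. This problem does not occur when the models are on the local file system.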
// Like PollFileSystemForConfig(), but for a single servable.
Status PollFileSystemForServable(
    const FileSystemStoragePathSourceConfig::ServableToMonitor& servable,
    std::vector<ServableData<StoragePath>>* versions) {
  // First, determine whether the base path exists. This check guarantees that
  // we don't emit an empty aspired-versions list for a non-existent (or
  // transiently unavailable) base-path. (On some platforms, GetChildren()
  // returns an empty list instead of erring if the base path isn't found.)
  if (!Env::Default()->FileExists(servable.base_path()).ok()) {
    return errors::InvalidArgument("Could not find base path ",
                                   servable.base_path(), " for servable ",
                                   servable.servable_name());
  }
  // Retrieve a list of base-path children from the file system.
  std::vector<string> children;
  TF_RETURN_IF_ERROR(
      Env::Default()->GetChildren(servable.base_path(), &children));
  // GetChildren() returns all descendants instead for cloud storage like GCS.
  // In such case we should filter out all non-direct descendants.
  std::set<string> real_children;
  for (int i = 0; i < children.size(); ++i) {
    const string& child = children[i];
    real_children.insert(child.substr(0, child.find_first_of('/')));
  }
  children.clear();
  children.insert(children.begin(), real_children.begin(), real_children.end());
  const std::map<int64 /* version */, string /* child */> children_by_version =
      IndexChildrenByVersion(children);
  bool at_least_one_version_found = false;
  switch (servable.servable_version_policy().policy_choice_case()) {
    case FileSystemStoragePathSourceConfig::ServableVersionPolicy::
        POLICY_CHOICE_NOT_SET:
      TF_FALLTHROUGH_INTENDED;  // Default policy is kLatest.
    case FileSystemStoragePathSourceConfig::ServableVersionPolicy::kLatest:
      at_least_one_version_found =
          AspireLatestVersions(servable, children_by_version, versions);
      break;
    case FileSystemStoragePathSourceConfig::ServableVersionPolicy::kAll:
      at_least_one_version_found =
          AspireAllVersions(servable, children, versions);
      break;
    case FileSystemStoragePathSourceConfig::ServableVersionPolicy::kSpecific: {
      at_least_one_version_found =
          AspireSpecificVersions(servable, children_by_version, versions);
      break;
    }
    default:
      return errors::Internal("Unhandled servable version_policy: ",
                              servable.servable_version_policy().DebugString());
  }
  if (!at_least_one_version_found) {
    LOG(WARNING) << "No versions of servable " << servable.servable_name()
                 << " found under base path " << servable.base_path();
  }
  return Status::OK();
}
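If the config file is generated by a tool rather than written by hand, one way to avoid this mistake is to normalize each base path before writing it out. The following is a minimal sketch, not part of TensorFlow Serving: the helper name EnsureTrailingSlash is hypothetical, and it assumes that only paths starting with s3:// need the trailing slash, since local paths work either way.

```cpp
#include <string>

// Hypothetical helper (not a TensorFlow Serving API): returns base_path with
// a trailing '/' appended when it is an s3:// path that lacks one, so that
// the path names a directory-like prefix instead of a single object.
// Paths with any other scheme, and local paths, are returned unchanged.
std::string EnsureTrailingSlash(const std::string& base_path) {
  const std::string kS3Scheme = "s3://";
  const bool is_s3 =
      base_path.compare(0, kS3Scheme.size(), kS3Scheme) == 0;
  if (is_s3 && base_path.back() != '/') {
    return base_path + "/";
  }
  return base_path;
}
```

Applied to the config above, s3://xxx-ai/humanoid/10075 would become s3://xxx-ai/humanoid/10075/, while a path that already ends with a slash or a local path is left untouched.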