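Caching reduces redundant data transfer, eases network bottlenecks, lowers server load, and speeds up page loads, which in turn improves the browsing experience. WKWebView caches front-end and back-end resources to improve page performance and user experience. Because WKWebView is a closed component, we cannot customize the native WKWebView in any depth, but studying the WebKit caching source helps us use and understand its caches better. This article continues the approach of [《iOS 端 webkit 源码调试与分析》] (Debugging and Analyzing the WebKit Source on iOS): it enumerates the kinds of cache in WKWebView with reference to the source, and focuses on the HTTP protocol cache, to help readers understand how caching is designed in WebKit.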
// HTTP disk cache.
WKWebsiteDataTypeDiskCache,
// HTML offline web application cache.
WKWebsiteDataTypeOfflineWebApplicationCache,
// HTTP memory cache.
WKWebsiteDataTypeMemoryCache,
// Session storage: data is accessible only to pages in the same session and is destroyed when the session ends.
// sessionStorage is therefore not persistent local storage, only session-level storage.
WKWebsiteDataTypeSessionStorage,
// Local storage: localStorage is similar to sessionStorage, except that data stored in localStorage is kept long term.
WKWebsiteDataTypeLocalStorage,
// Cookie storage: holds all cookie data.
WKWebsiteDataTypeCookies,
// IndexedDB databases: IndexedDB is the replacement for Web SQL. It is a key-value database and simple to use.
WKWebsiteDataTypeIndexedDBDatabases,
// Web SQL databases: the W3C deprecated Web SQL on November 18, 2010; its interface is complex and unfriendly to users.
WKWebsiteDataTypeWebSQLDatabases
Our data analysis shows that IndexedDB and NetworkCache take up most of the space, together more than 80%. WebKit's on-disk caches are distributed as follows:
Disk directory | Cache types |
---|---|
Library/WebKit | IndexedDB, LocalStorage, MediaKeys, ResourceLoadStatistics |
Library/Caches/WebKit | CacheStorage, NetworkCache, offlineWebApplicationCache, ServiceWorkers |
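In WebKit, pageCache is a wrapper around WebBackForwardCache, the back/forward cache. It is essentially a record of browsing history and is not one of the standard caches listed above. The back/forward cache keeps a snapshot of the whole page in memory, so the next time the page is shown no resources need to be loaded and even rendering can be skipped.
Reading the source shows that the pageCache size varies with the amount of usable memory:
Device available memory a | Pages that can be cached |
---|---|
a >= 512 MB | 2 |
512 MB > a >= 256 MB | 1 |
other | 0 |
The source for this caching policy is as follows: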
// back/forward cache capacity (in pages)
if (memorySize >= 512)
backForwardCacheCapacity = 2;
else if (memorySize >= 256)
backForwardCacheCapacity = 1;
else
backForwardCacheCapacity = 0;
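Cached pages expire after 30 minutes by default. A timer triggers the cleanup task, so an expired page is removed automatically 30 minutes after it was cached. The source is as follows:
static const Seconds expirationDelay { 30_min };
// Fired by a timer: once the expiration delay has elapsed, the entry is cleaned up.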
void WebBackForwardCacheEntry::expirationTimerFired()
{
RELEASE_LOG(BackForwardCache, "%p - WebBackForwardCacheEntry::expirationTimerFired backForwardItemID=%s, hasSuspendedPage=%d", this, m_backForwardItemID.string().utf8().data(), !!m_suspendedPage);
ASSERT(m_backForwardItemID);
auto* item = WebBackForwardListItem::itemForID(m_backForwardItemID);
ASSERT(item);
m_backForwardCache.removeEntry(*item); // Will destroy |this|.
}
Because pageCache holds only a limited number of pages, once the limit is exceeded entries are replaced using the following LRU algorithm:
void BackForwardCache::prune(PruningReason pruningReason)
{
while (pageCount() > maxSize()) {
auto oldestItem = m_items.takeFirst();
oldestItem->setCachedPage(nullptr);
oldestItem->m_pruningReason = pruningReason;
RELEASE_LOG(BackForwardCache, "BackForwardCache::prune removing item: %s, size: %u / %u", oldestItem->identifier().string().utf8().data(), pageCount(), maxSize());
}
}
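The pruning loop above can be illustrated outside WebKit with a minimal sketch. `TinyBackForwardCache` and its members are hypothetical names invented for this example, not WebKit types; the sketch assumes entries are kept in use order with the least recently used page at the front of the list.

```cpp
#include <cstddef>
#include <list>
#include <string>

// Hypothetical stand-in for BackForwardCache: it stores page identifiers only.
class TinyBackForwardCache {
public:
    explicit TinyBackForwardCache(std::size_t maxSize) : m_maxSize(maxSize) {}

    // Adding (or re-adding) a page moves it to the most-recently-used end.
    void add(const std::string& pageID)
    {
        m_items.remove(pageID);
        m_items.push_back(pageID);
        prune();
    }

    bool contains(const std::string& pageID) const
    {
        for (const auto& item : m_items) {
            if (item == pageID)
                return true;
        }
        return false;
    }

    std::size_t pageCount() const { return m_items.size(); }

private:
    // Same shape as the loop in BackForwardCache::prune: evict from the
    // front (least recently used) until the cache fits its capacity.
    void prune()
    {
        while (m_items.size() > m_maxSize)
            m_items.pop_front();
    }

    std::size_t m_maxSize;
    std::list<std::string> m_items;
};
```

With a capacity of 2, adding pages a, b, a, c in that order evicts b, because a was touched more recently than b when c arrived.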
The source that decides when a page is cached is as follows:
bool WebPageProxy::suspendCurrentPageIfPossible(...) {
...
// If the source and the destination back / forward list items are the same, then this is a client-side redirect. In this case,
// there is no need to suspend the previous page as there will be no way to get back to it.
if (fromItem && fromItem == m_backForwardList->currentItem()) {
RELEASE_LOG_IF_ALLOWED(ProcessSwapping, "suspendCurrentPageIfPossible: Not suspending current page for process pid %i because this is a client-side redirect", m_process->processIdentifier());
return false;
}
...
// Create the SuspendedPageProxy; m_suspendedPageCount is incremented at this point.
auto suspendedPage = makeUnique<SuspendedPageProxy>(*this, m_process.copyRef(), *mainFrameID, shouldDelayClosingUntilFirstLayerFlush);
m_lastSuspendedPage = makeWeakPtr(*suspendedPage);
...
// Add the suspended page to the back/forward (history stack) cache.
backForwardCache().addEntry(*fromItem, WTFMove(suspendedPage));
...
}
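As shown, when WKWebView switches pages, a navigation that is both cross-site and a client-side redirect clears all back/forward cache entries associated with the current WebProcessProxy, so returning to those history items later requires new network requests. All other navigation types are cached normally. For good back/forward performance it is therefore better to reduce front-end (client-side) redirects and rely on server-side redirects instead.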
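Caches held in memory disappear when their process exits. Caches on disk can be cleared manually with the methods below, which keeps disk usage from growing too large.
Much of the data WebKit keeps on disk uses the domain name as part of the file name, so files can also be matched by domain, date, and so on, and then deleted: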
NSString *libraryDir = NSSearchPathForDirectoriesInDomains(NSLibraryDirectory,NSUserDomainMask, YES)[0];
NSString *bundleId = [[[NSBundle mainBundle] infoDictionary] objectForKey:@"CFBundleIdentifier"];
NSString *webkitFolderInLib = [NSString stringWithFormat:@"%@/WebKit",libraryDir];
NSString *webKitFolderInCaches = [NSString stringWithFormat:@"%@/Caches/%@/WebKit",libraryDir,bundleId];
NSError *error;
[[NSFileManager defaultManager] removeItemAtPath:webKitFolderInCaches error:&error];
[[NSFileManager defaultManager] removeItemAtPath:webkitFolderInLib error:nil];
Figure: sample localStorage storage files.
Since iOS 9.0, WebKit has provided an API for clearing caches; in our testing it must be called on the main thread.
NSSet *websiteDataTypes = [NSSet setWithArray:@[
WKWebsiteDataTypeDiskCache,
WKWebsiteDataTypeOfflineWebApplicationCache,
WKWebsiteDataTypeLocalStorage,
WKWebsiteDataTypeCookies,
WKWebsiteDataTypeSessionStorage,
WKWebsiteDataTypeIndexedDBDatabases,
WKWebsiteDataTypeWebSQLDatabases
]];
NSDate *dateFrom = [NSDate dateWithTimeIntervalSince1970:0];
// dataTypes: the website data types to delete; date: all website data modified after this date is deleted; completionHandler: the block called once the data has been removed.
[[WKWebsiteDataStore defaultDataStore] removeDataOfTypes:websiteDataTypes modifiedSince:dateFrom completionHandler:^{
// Completion callback
}];
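WKWebView runs in a different process from the app, and its memory and disk caches also live in different processes: the memory cache lives in the WebContentProcess, while the disk cache lives in the NetworkProcess. Each memoryCache corresponds to one WebContent process, as shown in the figure.
As the figure above shows, each page corresponds to one WebContentProcess; when the page is destroyed, its memory cache is destroyed with it.
Although the WebKit processes are independent of the app process, excessive memory use still hurts the app's performance, so the memory cache is sized according to the device's memory.
Device available memory a | Memory cache allocated to the page |
---|---|
a >= 2 GB | 128 MB |
2 GB > a >= 1.5 GB | 96 MB |
1.5 GB > a >= 1 GB | 64 MB |
1 GB > a >= 0.5 GB | 32 MB |
other | 16 MB |
The source that computes the cache size is as follows: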
case CacheModel::PrimaryWebBrowser: {
// back/forward cache capacity (in pages)
if (memorySize >= 512)
backForwardCacheCapacity = 2;
else if (memorySize >= 256)
backForwardCacheCapacity = 1;
else
backForwardCacheCapacity = 0;
// Object cache capacities (in bytes)
// (Testing indicates that value / MB depends heavily on content and
// browsing pattern. Even growth above 128MB can have substantial
// value / MB for some content / browsing patterns.)
if (memorySize >= 2048)
cacheTotalCapacity = 128 * MB;
else if (memorySize >= 1536)
cacheTotalCapacity = 96 * MB;
else if (memorySize >= 1024)
cacheTotalCapacity = 64 * MB;
else if (memorySize >= 512)
cacheTotalCapacity = 32 * MB;
else
cacheTotalCapacity = 16 * MB;
cacheMinDeadCapacity = cacheTotalCapacity / 4;
cacheMaxDeadCapacity = cacheTotalCapacity / 2;
// This code is here to avoid a PLT regression. We can remove it if we
// can prove that the overall system gain would justify the regression.
cacheMaxDeadCapacity = std::max(24u, cacheMaxDeadCapacity);
deadDecodedDataDeletionInterval = 60_s;
break;
}
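To make the tiers concrete, here is a minimal standalone sketch of the same thresholds. `CacheCapacities` and `capacitiesForMemorySize` are hypothetical names for this example; the sketch assumes `memorySizeMB` is expressed in megabytes, as in the excerpt, and it leaves out the `std::max(24u, ...)` floor on the dead capacity.

```cpp
#include <cstdint>

// Hypothetical container for the values computed in the PrimaryWebBrowser branch.
struct CacheCapacities {
    unsigned backForwardPages;
    std::uint64_t totalBytes;
    std::uint64_t minDeadBytes;
    std::uint64_t maxDeadBytes;
};

// memorySizeMB: device memory in megabytes (assumption taken from the excerpt).
CacheCapacities capacitiesForMemorySize(std::uint64_t memorySizeMB)
{
    const std::uint64_t MB = 1024 * 1024;
    CacheCapacities result = { 0, 0, 0, 0 };

    // Back/forward cache capacity, in pages.
    if (memorySizeMB >= 512)
        result.backForwardPages = 2;
    else if (memorySizeMB >= 256)
        result.backForwardPages = 1;

    // Object (memory) cache capacity, in bytes.
    if (memorySizeMB >= 2048)
        result.totalBytes = 128 * MB;
    else if (memorySizeMB >= 1536)
        result.totalBytes = 96 * MB;
    else if (memorySizeMB >= 1024)
        result.totalBytes = 64 * MB;
    else if (memorySizeMB >= 512)
        result.totalBytes = 32 * MB;
    else
        result.totalBytes = 16 * MB;

    // Dead (unreferenced) resources may use between 1/4 and 1/2 of the total.
    result.minDeadBytes = result.totalBytes / 4;
    result.maxDeadBytes = result.totalBytes / 2;
    return result;
}
```

For a device with 1024 MB this gives 2 cached pages, a 64 MB object cache, and a dead-resource budget between 16 MB and 32 MB.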
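The memory cache is a map held in memory: the key is the resource URL (paired with its cache partition) and the value is the resource. Every resource the current page fetches with an HTTP GET request is stored in it.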
bool MemoryCache::add(CachedResource& resource)
{
if (disabled())
return false;
if (resource.resourceRequest().httpMethod() != "GET")
return false;
ASSERT(WTF::isMainThread());
auto key = std::make_pair(resource.url(), resource.cachePartition());
ensureSessionResourceMap(resource.sessionID()).set(key, &resource);
resource.setInCache(true);
resourceAccessed(resource);
LOG(ResourceLoading, "MemoryCache::add Added '%.255s', resource %p\n", resource.url().string().latin1().data(), &resource);
return true;
}
CachedResource* MemoryCache::resourceForRequest(const ResourceRequest& request, PAL::SessionID sessionID)
{
// FIXME: Change all clients to make sure HTTP(s) URLs have no fragment identifiers before calling here.
// CachedResourceLoader is now doing this. Add an assertion once all other clients are doing it too.
auto* resources = sessionResourceMap(sessionID);
if (!resources)
return nullptr;
return resourceForRequestImpl(request, *resources);
}
CachedResource* MemoryCache::resourceForRequestImpl(const ResourceRequest& request, CachedResourceMap& resources)
{
ASSERT(WTF::isMainThread());
URL url = removeFragmentIdentifierIfNeeded(request.url());
auto key = std::make_pair(url, request.cachePartition());
return resources.get(key);
}
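The keying scheme can be sketched in a few lines. `TinyMemoryCache` and `stripFragment` are hypothetical names; as a simplification the sketch strips the fragment from every URL on both insert and lookup, and stores a plain string instead of a CachedResource.

```cpp
#include <map>
#include <string>
#include <utility>

// Simplification (assumption): drop everything from the first '#',
// whatever the URL scheme is.
std::string stripFragment(const std::string& url)
{
    return url.substr(0, url.find('#'));
}

// Hypothetical stand-in for MemoryCache, keyed by (URL, cache partition).
class TinyMemoryCache {
public:
    typedef std::pair<std::string, std::string> Key;

    // Mirrors the checks in MemoryCache::add: only GET responses are stored.
    bool add(const std::string& method, const std::string& url,
        const std::string& partition, const std::string& body)
    {
        if (method != "GET")
            return false;
        m_resources[Key(stripFragment(url), partition)] = body;
        return true;
    }

    // Returns nullptr when nothing is cached for this (URL, partition) pair.
    const std::string* find(const std::string& url, const std::string& partition) const
    {
        auto it = m_resources.find(Key(stripFragment(url), partition));
        return it == m_resources.end() ? nullptr : &it->second;
    }

private:
    std::map<Key, std::string> m_resources;
};
```

Because the partition is part of the key, the same URL cached under one partition is not visible from another.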
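Unlike the disk cache, the HTTP memory cache does not strictly follow the HTTP caching standard when deciding whether to read from the cache; instead it follows the browser's resource-loading policy. For example:
// Whether a network load uses the memory cache follows one of these policies:
enum RevalidationPolicy {
Use, // Use the cached resource directly
Revalidate, // Must be validated through the HTTP caching protocol
Reload, // Reload: evict the memory cache entry and request again
Load // Load directly from the network
};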
RevalidationPolicy policy = determineRevalidationPolicy(type, request, resource.get(), forPreload, imageLoading);
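The disk cache is designed to follow the standard HTTP caching protocol in full. All network requests are issued by the NetworkProcess. Before a request is sent it is checked against the caching protocol and handled accordingly (read from cache, revalidate with the server, or bypass the cache). When the server returns the response, the NetworkProcess likewise decides whether the content should be cached or an existing entry updated.
The disk cache is written to a designated directory, which defaults to Library/Caches/WebKit/NetworkCache and can be configured. Its capacity is chosen from the free disk space, as the following source shows: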
case CacheModel::PrimaryWebBrowser: {
// Disk cache capacity (in bytes)
if (diskFreeSize >= 16384)
urlCacheDiskCapacity = 1 * GB;
else if (diskFreeSize >= 8192)
urlCacheDiskCapacity = 500 * MB;
else if (diskFreeSize >= 4096)
urlCacheDiskCapacity = 250 * MB;
else if (diskFreeSize >= 2048)
urlCacheDiskCapacity = 200 * MB;
else if (diskFreeSize >= 1024)
urlCacheDiskCapacity = 150 * MB;
else
urlCacheDiskCapacity = 100 * MB;
break;
}
default:
ASSERT_NOT_REACHED();
};
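This part decides, from the request and the response, whether the response should be stored in the cache. It mainly checks the scheme, the method, and the resource's caching policy.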
// WebKit/Source/WebKit/NetworkProcess/cache/NetworkCache.cpp
static StoreDecision makeStoreDecision(const WebCore::ResourceRequest& originalRequest, const WebCore::ResourceResponse& response, size_t bodySize)
{
if (!originalRequest.url().protocolIsInHTTPFamily() || !response.isInHTTPFamily())
return StoreDecision::NoDueToProtocol;
if (originalRequest.httpMethod() != "GET")
return StoreDecision::NoDueToHTTPMethod;
auto requestDirectives = WebCore::parseCacheControlDirectives(originalRequest.httpHeaderFields());
if (requestDirectives.noStore)
return StoreDecision::NoDueToNoStoreRequest;
if (response.cacheControlContainsNoStore())
return StoreDecision::NoDueToNoStoreResponse;
if (!WebCore::isStatusCodeCacheableByDefault(response.httpStatusCode())) {
// http://tools.ietf.org/html/rfc7234#section-4.3.2
bool hasExpirationHeaders = response.expires() || response.cacheControlMaxAge();
bool expirationHeadersAllowCaching = WebCore::isStatusCodePotentiallyCacheable(response.httpStatusCode()) && hasExpirationHeaders;
if (!expirationHeadersAllowCaching)
return StoreDecision::NoDueToHTTPStatusCode;
}
bool isMainResource = originalRequest.requester() == WebCore::ResourceRequest::Requester::Main;
bool storeUnconditionallyForHistoryNavigation = isMainResource || originalRequest.priority() == WebCore::ResourceLoadPriority::VeryHigh;
if (!storeUnconditionallyForHistoryNavigation) {
auto now = WallTime::now();
Seconds allowedStale { 0_ms };
#if ENABLE(NETWORK_CACHE_STALE_WHILE_REVALIDATE)
if (auto value = response.cacheControlStaleWhileRevalidate())
allowedStale = value.value();
#endif
bool hasNonZeroLifetime = !response.cacheControlContainsNoCache() && (WebCore::computeFreshnessLifetimeForHTTPFamily(response, now) > 0_ms || allowedStale > 0_ms);
bool possiblyReusable = response.hasCacheValidatorFields() || hasNonZeroLifetime;
if (!possiblyReusable)
return StoreDecision::NoDueToUnlikelyToReuse;
}
// Media loaded via XHR is likely being used for MSE streaming (YouTube and Netflix for example).
// Streaming media fills the cache quickly and is unlikely to be reused.
// FIXME: We should introduce a separate media cache partition that doesn't affect other resources.
// FIXME: We should also make sure make the MSE paths are copy-free so we can use mapped buffers from disk effectively.
auto requester = originalRequest.requester();
bool isDefinitelyStreamingMedia = requester == WebCore::ResourceRequest::Requester::Media;
bool isLikelyStreamingMedia = requester == WebCore::ResourceRequest::Requester::XHR && isMediaMIMEType(response.mimeType());
if (isLikelyStreamingMedia || isDefinitelyStreamingMedia)
return StoreDecision::NoDueToStreamingMedia;
return StoreDecision::Yes;
}
This part decides, from the request alone, whether to look in the cache at all. It mainly checks the method and the request's cache policy.
// WebKit/Source/WebKit/NetworkProcess/cache/NetworkCache.cpp
static RetrieveDecision makeRetrieveDecision(const WebCore::ResourceRequest& request)
{
ASSERT(request.cachePolicy() != WebCore::ResourceRequestCachePolicy::DoNotUseAnyCache);
// FIXME: Support HEAD requests.
if (request.httpMethod() != "GET")
return RetrieveDecision::NoDueToHTTPMethod;
if (request.cachePolicy() == WebCore::ResourceRequestCachePolicy::ReloadIgnoringCacheData && !request.isConditional())
return RetrieveDecision::NoDueToReloadIgnoringCache;
return RetrieveDecision::Yes;
}
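The retrieve decision is small enough to restate as a standalone sketch. The enums below are reduced, hypothetical versions of the WebKit ones, and `retrieveDecisionFor` takes plain values instead of a ResourceRequest.

```cpp
#include <string>

// Hypothetical, reduced versions of the enums used in the excerpt above.
enum class CachePolicy { UseProtocolCachePolicy, ReloadIgnoringCacheData, ReturnCacheDataElseLoad };
enum class RetrieveDecision { Yes, NoDueToHTTPMethod, NoDueToReloadIgnoringCache };

RetrieveDecision retrieveDecisionFor(const std::string& method, CachePolicy policy, bool isConditional)
{
    // Only GET requests are looked up in the disk cache.
    if (method != "GET")
        return RetrieveDecision::NoDueToHTTPMethod;
    // A forced reload skips the cache, unless the request is conditional,
    // in which case the entry is still read so it can be updated afterwards.
    if (policy == CachePolicy::ReloadIgnoringCacheData && !isConditional)
        return RetrieveDecision::NoDueToReloadIgnoringCache;
    return RetrieveDecision::Yes;
}
```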
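This part decides, from the request and the cached response, whether the cached entry can be used directly. It mainly uses the caching header fields to work out whether the resource has expired.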
// WebKit/Source/WebKit/NetworkProcess/cache/NetworkCache.cpp
static UseDecision makeUseDecision(NetworkProcess& networkProcess, const PAL::SessionID& sessionID, const Entry& entry, const WebCore::ResourceRequest& request)
{
// The request is conditional so we force revalidation from the network. We merely check the disk cache
// so we can update the cache entry.
// Check for a conditional request | bool ResourceRequestBase::isConditional
if (request.isConditional() && !entry.redirectRequest())
return UseDecision::Validate;
// Verify the request headers named by Vary | verifyVaryingRequestHeaders
if (!WebCore::verifyVaryingRequestHeaders(networkProcess.storageSession(sessionID), entry.varyingRequestHeaders(), request))
return UseDecision::NoDueToVaryingHeaderMismatch;
// We never revalidate in the case of a history navigation.
// Check whether the cache policy allows expired entries | cachePolicyAllowsExpired
if (cachePolicyAllowsExpired(request.cachePolicy()))
return UseDecision::Use;
// Check whether the cached response has expired
auto decision = responseNeedsRevalidation(*networkProcess.networkSession(sessionID), entry.response(), request, entry.timeStamp());
if (decision != UseDecision::Validate)
return decision;
// Check for cache validator fields (ETag etc.) | bool ResourceResponseBase::hasCacheValidatorFields()
if (!entry.response().hasCacheValidatorFields())
return UseDecision::NoDueToMissingValidatorFields;
return entry.redirectRequest() ? UseDecision::NoDueToExpiredRedirect : UseDecision::Validate;
}
This part computes the freshness lifetime of the resource from its caching header fields.
// WebKit/Source/WebCore/platform/network/CacheValidation.cpp
Seconds computeFreshnessLifetimeForHTTPFamily(const ResourceResponse& response, WallTime responseTime)
{
if (!response.url().protocolIsInHTTPFamily())
return 0_us;
// Freshness Lifetime:
// http://tools.ietf.org/html/rfc7234#section-4.2.1
auto maxAge = response.cacheControlMaxAge();
if (maxAge)
return *maxAge;
auto date = response.date();
auto effectiveDate = date.valueOr(responseTime);
if (auto expires = response.expires())
return *expires - effectiveDate;
// Implicit lifetime.
switch (response.httpStatusCode()) {
case 301: // Moved Permanently
case 410: // Gone
// These are semantically permanent and so get long implicit lifetime.
return 24_h * 365;
default:
// Heuristic Freshness:
// http://tools.ietf.org/html/rfc7234#section-4.2.2
if (auto lastModified = response.lastModified())
return (effectiveDate - *lastModified) * 0.1;
return 0_us;
}
}
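The order of precedence (max-age, then Expires, then the implicit and heuristic lifetimes) is easier to see with numbers. The sketch below is a simplified restatement, not WebKit code: `ResponseInfo`, `kAbsent`, and `freshnessLifetimeSeconds` are hypothetical names, all times are plain seconds, and a negative value stands for a missing header.

```cpp
// Sentinel meaning "this header was not present" (assumption of this sketch).
const double kAbsent = -1.0;

// Hypothetical, simplified view of the response fields used by the excerpt.
struct ResponseInfo {
    int statusCode;
    double maxAge;        // Cache-Control: max-age, in seconds
    double date;          // Date, seconds since some epoch
    double expires;       // Expires, seconds since the same epoch
    double lastModified;  // Last-Modified, seconds since the same epoch
};

double freshnessLifetimeSeconds(const ResponseInfo& r, double responseTime)
{
    // 1. An explicit max-age wins over everything else.
    if (r.maxAge >= 0)
        return r.maxAge;

    // 2. Otherwise Expires, measured from Date (or the response time).
    double effectiveDate = r.date >= 0 ? r.date : responseTime;
    if (r.expires >= 0)
        return r.expires - effectiveDate;

    // 3. Permanent status codes get a one-year implicit lifetime.
    if (r.statusCode == 301 || r.statusCode == 410)
        return 365.0 * 24 * 3600;

    // 4. Heuristic freshness: 10% of the time since the last modification.
    if (r.lastModified >= 0)
        return (effectiveDate - r.lastModified) * 0.1;
    return 0;
}
```

For example, a response with no max-age or Expires whose Last-Modified is 10 days before its Date is treated as fresh for about one day.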
This part uses the caching header fields to decide whether the resource has expired.
// WebKit/Source/WebKit/NetworkProcess/cache/NetworkCache.cpp
static UseDecision responseNeedsRevalidation(NetworkSession& networkSession, const WebCore::ResourceResponse& response, WallTime timestamp, Optional<Seconds> maxStale)
{
if (response.cacheControlContainsNoCache())
return UseDecision::Validate;
// Current age = current time - resource time | computeCurrentAge
auto age = WebCore::computeCurrentAge(response, timestamp);
// Compute the resource's freshness lifetime | computeFreshnessLifetimeForHTTPFamily
auto lifetime = WebCore::computeFreshnessLifetimeForHTTPFamily(response, timestamp);
// Staleness the request is willing to accept
auto maximumStaleness = maxStale ? maxStale.value() : 0_ms;
// Has the resource expired? | current age - freshness lifetime - allowed staleness > 0 => expired
bool hasExpired = age - lifetime > maximumStaleness;
#if ENABLE(NETWORK_CACHE_STALE_WHILE_REVALIDATE)
if (hasExpired && !maxStale && networkSession.isStaleWhileRevalidateEnabled()) {
auto responseMaxStaleness = response.cacheControlStaleWhileRevalidate();
maximumStaleness += responseMaxStaleness ? responseMaxStaleness.value() : 0_ms;
bool inResponseStaleness = age - lifetime < maximumStaleness;
if (inResponseStaleness)
return UseDecision::AsyncRevalidate;
}
#endif
if (hasExpired) {
#ifndef LOG_DISABLED
LOG(NetworkCache, "(NetworkProcess) needsRevalidation hasExpired age=%f lifetime=%f max-staleness=%f", age, lifetime, maximumStaleness);
#endif
return UseDecision::Validate;
}
return UseDecision::Use;
}
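The expiry test reduces to comparing `age - lifetime` against the allowed staleness. The following sketch restates it with plain numbers; `revalidationDecision` and its parameters are hypothetical, durations are in seconds, and it assumes the stale-while-revalidate feature is compiled in and enabled.

```cpp
enum class UseDecision { Use, Validate, AsyncRevalidate };

// requestMaxStale < 0 means the request carried no max-stale value;
// responseStaleWhileRevalidate == 0 means the response had no such directive.
UseDecision revalidationDecision(bool responseHasNoCache, double age, double lifetime,
    double requestMaxStale, double responseStaleWhileRevalidate)
{
    // Cache-Control: no-cache always forces validation.
    if (responseHasNoCache)
        return UseDecision::Validate;

    bool hasMaxStale = requestMaxStale >= 0;
    double maximumStaleness = hasMaxStale ? requestMaxStale : 0;
    bool hasExpired = age - lifetime > maximumStaleness;

    // Expired, but still inside the stale-while-revalidate window:
    // serve the stale copy and refresh it in the background.
    if (hasExpired && !hasMaxStale) {
        maximumStaleness += responseStaleWhileRevalidate;
        if (age - lifetime < maximumStaleness)
            return UseDecision::AsyncRevalidate;
    }
    return hasExpired ? UseDecision::Validate : UseDecision::Use;
}
```

With a lifetime of 300 s, an entry aged 100 s is used directly, an entry aged 400 s is validated, and the same 400 s entry is revalidated asynchronously if the response allowed 200 s of stale-while-revalidate.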
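An expired resource has to be checked with the server: a conditional request is constructed and sent so that the server can confirm whether the stale resource is still usable.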
// WebKit/Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp
void NetworkResourceLoader::validateCacheEntry(std::unique_ptr<NetworkCache::Entry> entry)
{
RELEASE_LOG_IF_ALLOWED("validateCacheEntry:");
ASSERT(!m_networkLoad);
// If the request is already conditional then the revalidation was not triggered by the disk cache
// and we should not overwrite the existing conditional headers.
// If the request is already conditional, leave its conditional headers untouched
ResourceRequest revalidationRequest = originalRequest();
if (!revalidationRequest.isConditional()) {
String eTag = entry->response().httpHeaderField(HTTPHeaderName::ETag);
String lastModified = entry->response().httpHeaderField(HTTPHeaderName::LastModified);
// Add the cache validation request headers, If-None-Match and If-Modified-Since
if (!eTag.isEmpty())
revalidationRequest.setHTTPHeaderField(HTTPHeaderName::IfNoneMatch, eTag);
if (!lastModified.isEmpty())
revalidationRequest.setHTTPHeaderField(HTTPHeaderName::IfModifiedSince, lastModified);
}
m_cacheEntryForValidation = WTFMove(entry);
// Send the request
startNetworkLoad(WTFMove(revalidationRequest), FirstLoad::Yes);
}
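How the validators of a cached response turn into a conditional request can be sketched with plain header maps. `Headers`, `isConditional`, and `makeRevalidationRequest` are hypothetical helpers for this example; header names are matched case-sensitively for brevity, and the list of conditional headers is an assumption based on the standard HTTP preconditions.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, std::string> Headers;

// A request counts as conditional if it already carries a precondition header.
bool isConditional(const Headers& request)
{
    static const char* const conditionalHeaders[] = {
        "If-Match", "If-Modified-Since", "If-None-Match", "If-Range", "If-Unmodified-Since"
    };
    for (const char* name : conditionalHeaders) {
        if (request.count(name))
            return true;
    }
    return false;
}

// Copies the original request and, unless it is already conditional, adds
// If-None-Match / If-Modified-Since from the cached response's validators.
Headers makeRevalidationRequest(const Headers& originalRequest, const Headers& cachedResponse)
{
    Headers request = originalRequest;
    if (isConditional(request))
        return request;

    auto eTag = cachedResponse.find("ETag");
    if (eTag != cachedResponse.end() && !eTag->second.empty())
        request["If-None-Match"] = eTag->second;

    auto lastModified = cachedResponse.find("Last-Modified");
    if (lastModified != cachedResponse.end() && !lastModified->second.empty())
        request["If-Modified-Since"] = lastModified->second;
    return request;
}
```

If the server answers 304 Not Modified to such a request, the cached body can be reused; any other response replaces the cached entry.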
Once the server has validated the resource, the existing cache entry is updated, and the updated entry is returned to the client.
// WebKit/Source/WebCore/platform/network/CacheValidation.cpp
void updateResponseHeadersAfterRevalidation(ResourceResponse& response, const ResourceResponse& validatingResponse)
{
// Freshening stored response upon validation:
// http://tools.ietf.org/html/rfc7234#section-4.3.4
for (const auto& header : validatingResponse.httpHeaderFields()) {
// Entity headers should not be sent by servers when generating a 304
// response; misconfigured servers send them anyway. We shouldn't allow
// such headers to update the original request. We'll base this on the
// list defined by RFC2616 7.1, with a few additions for extension headers
// we care about.
// Skip headers that must not be updated after revalidation
if (!shouldUpdateHeaderAfterRevalidation(header.key))
continue;
response.setHTTPHeaderField(header.key, header.value);
}
}
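The merge step can be sketched as follows. The excerpt does not show `shouldUpdateHeaderAfterRevalidation`, so the filter here is an illustrative assumption (a few explicit names plus the `content-` prefix), not WebKit's actual list; `Headers`, `shouldUpdateHeader`, and `updateHeadersAfterRevalidation` are hypothetical names, and header names are assumed to be lowercase already.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, std::string> Headers;

// Illustrative filter only: the real list lives in WebKit's CacheValidation.cpp.
bool shouldUpdateHeader(const std::string& lowercaseName)
{
    static const char* const ignored[] = { "etag", "last-modified", "set-cookie", "transfer-encoding" };
    for (const char* name : ignored) {
        if (lowercaseName == name)
            return false;
    }
    // Entity headers such as content-length describe the body, which a 304 does not carry.
    return lowercaseName.compare(0, 8, "content-") != 0;
}

// Copies every permitted header of the 304 response onto the stored response.
void updateHeadersAfterRevalidation(Headers& stored, const Headers& validating)
{
    for (const auto& header : validating) {
        if (!shouldUpdateHeader(header.first))
            continue;
        stored[header.first] = header.second;
    }
}
```

The effect is that freshness-related headers such as cache-control and date are refreshed, while headers that describe the stored body keep their original values.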
**References**
1. HTTP: The Definitive Guide (《HTTP 权威指南》)
2. "HTTP caching - HTTP | MDN", https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Caching_FAQ
3. "Cache-Control - HTTP | MDN", https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Headers/Cache-Control
4. "Message Syntax and Routing", https://tools.ietf.org/html/rfc7230
5. "Semantics and Content", https://tools.ietf.org/html/rfc7231
6. "Conditional Requests", https://tools.ietf.org/html/rfc7232
7. "Range Requests", https://tools.ietf.org/html/rfc7233
8. "Caching", https://tools.ietf.org/html/rfc7234
9. "Authentication", https://tools.ietf.org/html/rfc7235