Mobile Cloud Storage Dataset
We contribute to the research community a dataset of HTTP-level request logs collected from the storage front-end servers of a mobile cloud storage service. The service is very similar to Google Drive. The dataset spans one week and contains about 350 million HTTP request logs. Each day corresponds to a single plain-text file, as listed below. We also provide a small sample file of several thousand lines to facilitate quick browsing of the dataset format.
Each line corresponds to one HTTP request and has 10 fields:
- Timestamp in seconds: relative to the first request in the dataset.
- Mobile device type: 0 for Android, 1 for iOS.
- Device ID (anonymized): numerical ID that uniquely identifies a mobile device.
- User ID (anonymized): numerical ID that uniquely identifies a registered user. A user might use several devices.
- Request type: 0 for file storage operation request, 1 for file retrieval operation request, 2 for chunk uploading request, 3 for chunk downloading request.
- Data volume: the volume of uploaded (resp. downloaded) data for a storage (resp. retrieval) request.
- Request processing time: the duration between the first byte received by the front-end server and the last byte sent to the mobile client.
- Upstream response time: the time spent storing/preparing the requested content by the upstream storage servers, i.e., the servers that physically host the data. This value is missing from some logs, in which case it is recorded as '-1'.
- RTT: the average of all RTTs measured for the TCP connection on which the HTTP request is transferred. If missing, '-1' is assigned.
- Proxied or not: whether the request passed through a proxy, as determined from the HTTP header X-FORWARDED-FOR. 0 for not proxied, 1 for proxied.
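
The field list above can be turned into a small parser. The following is a minimal sketch rather than an official loader: it assumes the 10 fields appear in the order listed and are separated by whitespace, which the description does not state, so check the sample file and adjust the split if needed. The names `Request`, `parse_line`, and `_optional` are hypothetical, IDs are kept as opaque strings, and measurements are read as floats because their units are not specified here.

```python
from typing import NamedTuple, Optional

# Code-to-label mappings taken from the field descriptions above.
DEVICE_TYPES = {0: "Android", 1: "iOS"}
REQUEST_TYPES = {
    0: "file storage",
    1: "file retrieval",
    2: "chunk upload",
    3: "chunk download",
}


class Request(NamedTuple):
    timestamp: float            # seconds since the first request in the dataset
    device_type: str
    device_id: str              # anonymized, kept as an opaque key
    user_id: str                # anonymized, kept as an opaque key
    request_type: str
    data_volume: float
    processing_time: float
    upstream_time: Optional[float]  # None when logged as -1
    rtt: Optional[float]            # None when logged as -1
    proxied: bool


def _optional(value: str) -> Optional[float]:
    """Map the '-1' missing-value marker to None."""
    number = float(value)
    return None if number == -1 else number


def parse_line(line: str) -> Request:
    """Parse one log line (assumed whitespace-separated) into a Request."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}")
    return Request(
        timestamp=float(fields[0]),
        device_type=DEVICE_TYPES[int(fields[1])],
        device_id=fields[2],
        user_id=fields[3],
        request_type=REQUEST_TYPES[int(fields[4])],
        data_volume=float(fields[5]),
        processing_time=float(fields[6]),
        upstream_time=_optional(fields[7]),
        rtt=_optional(fields[8]),
        proxied=fields[9] == "1",
    )
```

Converting '-1' to `None` keeps missing upstream times and RTTs out of averages and percentiles by accident. Because the full dataset holds about 350 million lines, it is probably best to call `parse_line` while iterating over each daily file line by line instead of loading a whole file into memory.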