A case study of file system workload in a large-scale distributed environment


Author(s) : Harjinder S. Sandhu, Songnian Zhou, Deepinder S. Gill
Publisher : N/A
Publication Date : 1994
ISSN : N/A
Abstract : As large-scale distributed file systems come into widespread use supporting industrial organizations, the performance and scalability of file systems become important issues. Key factors such as file server bottlenecks and network congestion are heavily dependent on file system workload characteristics, yet no workload study on such systems has been reported to date. In this paper, we present a case study of a large industrial distributed file system supporting several thousand users. Our study is at the workgroup level, rather than at the individual file access level. Our study reveals substantial departures from observations made in previous academic and centralized commercial environments. The average file size is 25-30 Kbytes, much larger than that reported in previous studies. Over weekly periods, we found that more than one-third of the files are active. Of the active files, read activities dominate writes 9:1. We found a very high level of file sharing: 50-60% of the files are shared among two or more workgroups. On the other hand, a single workgroup remains the only writer for 97% of the write-active files, suggesting that dynamic file replication at the workgroup level may effectively improve file access performance. It is also noted that shared files tend to be larger, and that write activities decrease as file sharing increases. We believe that these observations are representative of an important class of industrial computing environments, and are valuable input to future file system design.