Securing a data set on allegations of sexual abuse made against the former disc jockey, Jimmy Savile

Smith, Mark; Llewellyn, Clare; Ruus, Laine; Kirkwood, Steve; Burnett, Ros

dc.contributor	ESRC - Economic and Social Research Council
dc.contributor	McDonnell, Diarmuid
dc.creator	Smith, Mark
dc.creator	Llewellyn, Clare
dc.creator	Ruus, Laine
dc.creator	Kirkwood, Steve
dc.creator	Burnett, Ros
dc.date	2017-08-31T13:15:05Z
dc.identifier	Smith, Mark; Llewellyn, Clare; Ruus, Laine; Kirkwood, Steve; Burnett, Ros. (2017). Securing a data set on allegations of sexual abuse made against the former disc jockey, Jimmy Savile, [dataset]. University of Edinburgh. School of Social and Political Science. Social Work. https://doi.org/10.7488/ds/2126.
dc.identifier	https://hdl.handle.net/10283/2809
dc.identifier	https://doi.org/10.7488/ds/2126
dc.description	## This dataset has been moved to the Edinburgh DataVault, where it is directly accessible only by authorised University of Edinburgh users. For further information please see https://www.research.ed.ac.uk/en/datasets/securing-a-data-set-on-allegations-of-sexual-abuse-made-against-t ## In this work we look at the initial phase of an ESRC-funded project involving academics from Social Work, Criminology, Informatics and the University of Edinburgh Library. This project collected and analysed a data set on allegations of sexual abuse made against the former disc jockey, Jimmy Savile. The Savile affair has taken place in a public and highly charged arena. It has generated massive media attention and spawned several public reports, most notably the one produced as a result of Operation Yewtree. Early allegations against Savile emanate from former residents at Duncroft, a residential school for ‘wayward but intelligent young women’. This project stems from data produced and collected by the blogger ‘Anna Raccoon’, herself a former resident at the school. Through her blogs on the subject of Savile and Duncroft she was contacted by others and has collected a variety of information on the subject. The data harvested from the blog are supplemented by official reports and other blogs.
dc.description	The initial component of the project involves capturing Anna Raccoon’s blog (The Raccoon Arms). This is a WordPress blog that was taken down by the author. Following previous research approaches [9, 8] we searched for copies of the site in other content management systems. We found that this site had been archived in several frozen states in the Internet Archive’s Wayback Machine (IA). An active blog is a constantly evolving object, and therefore careful consideration needs to be given as to what version or versions should be harvested. Given that the blog is available via the IA, one might question why it is necessary to download a copy at all. There are two main reasons for doing so. Firstly, the IA may at any time, and without notice, remove the objects from its archive. Secondly, a local copy makes it possible to provide additional functionality to support qualitative analysis of the content of the blog, as well as indexing to support additional resource discovery not provided within the blog software or the IA. While harvesting the contents of a blog manually can be a long and arduous process, it can be simplified and automated using a software solution, such as wget. Apart from soliciting permission from the IA, decisions need to be made as to which version or versions should be harvested. Further decisions concern the level of recursion for each harvest and whether just the blog text or all files contributing to the content and functionality of the blog should be gathered. Such decisions influence not only the size of the eventual object, but also the richness of the context. There are also concomitant drawbacks: the deeper the recursion, the greater the number of missing files (those that have not been harvested by the IA).
Because WordPress blogs are served as HTML files, the text portion (setting aside any images and other audio-visual files associated with the blog) is already in as efficient a format as possible, both for file storage and for using XML to provide value-added indexing and tagging. Storage capacity requirements depend largely on the number of snapshots of the blog that are harvested and the level of recursion specified in the harvests. The size of one snapshot can range from 53 MiB to 660 MiB (ranging from 1,500 to 88,000 files), depending on the options specified.
dc.format	application/zip
dc.format	application/pdf
dc.format	application/zip
dc.language	eng
dc.publisher	University of Edinburgh. School of Social and Political Science. Social Work
dc.rights	Creative Commons Attribution 4.0 International Public License
dc.subject	Jimmy Savile
dc.subject	sexual abuse
dc.subject	Duncroft
dc.subject	Social studies::Social Work
dc.title	Securing a data set on allegations of sexual abuse made against the former disc jockey, Jimmy Savile
dc.type	dataset
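
The description above mentions automating the harvest with wget and choosing a snapshot, a recursion level, and whether to gather supporting files. The following is a minimal sketch of how those choices could be expressed as a wget invocation, not the project's actual procedure. The snapshot timestamp, the blog address `http://blog.example.org/`, the output directory and the helper names are all hypothetical; the record does not state which wget options were used.

```python
import shlex

# Public address pattern of Wayback Machine snapshots:
# https://web.archive.org/web/<YYYYMMDDhhmmss>/<original URL>
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(timestamp: str, original_url: str) -> str:
    """Return the address of one archived snapshot of original_url."""
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


def build_wget_command(snapshot_url: str, depth: int,
                       with_assets: bool, out_dir: str) -> list[str]:
    """Assemble a wget argument list for one harvest (hypothetical helper).

    depth       -- recursion level; deeper harvests are larger and are
                   more likely to hit files the archive never captured
    with_assets -- also fetch images, stylesheets and scripts needed
                   to render each page, not just the HTML text
    """
    cmd = [
        "wget",
        "--recursive",
        f"--level={depth}",
        "--no-parent",        # stay below the snapshot's starting path
        "--convert-links",    # rewrite links so the copy browses offline
        "--wait=1",           # pause between requests to spare the server
        f"--directory-prefix={out_dir}",
    ]
    if with_assets:
        cmd.append("--page-requisites")
    cmd.append(snapshot_url)
    return cmd


if __name__ == "__main__":
    # Placeholder snapshot and blog address, for illustration only.
    url = wayback_url("20170101000000", "http://blog.example.org/")
    print(shlex.join(build_wget_command(url, depth=2,
                                        with_assets=True,
                                        out_dir="harvest")))
```

Running the script only prints the command, so the depth and asset options can be reviewed before anything is downloaded. Repeating it per snapshot timestamp would give one harvest directory per frozen state of the blog.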

Files in this item

Files	Size	Format	View
admin_docs.zip	1.206Gb	application/zip	View/Open
A Shared Langua ... Sensitive Information.pdf	87.20Kb	application/pdf	View/Open
Savile KE Event Survey Responses 9.zip	49.67Kb	application/zip	View/Open

This item appears in the following Collection(s)

Edinburgh DataShare - University of Edinburgh [3461]
