first draft of paper;

git-svn-id: https://wikiteam.googlecode.com/svn/trunk@908 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
emijrp 10 years ago
parent 95c2228f36
commit a88797717c

@ -0,0 +1,105 @@
\documentclass[11pt,twocolumn]{article}
\setlength{\columnsep}{0.5cm}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage[english]{babel}
\usepackage{hyperref}
\usepackage{graphicx}
\usepackage{natbib}
\title{\vspace{-15mm}
\fontsize{24pt}{10pt}\selectfont
\textbf{WikiTeam: collaborative preservation of wikis}
}
\author{
\large
\textsc{Emilio J. Rodríguez-Posada, Federico Leva, Luiz Augusto} \\
\normalsize WikiTeam \\
\normalsize \{\href{mailto:emijrp@gmail.com}{emijrp}, \href{mailto:nemowiki@gmail.com}{nemowiki}, \href{mailto:lugusto@gmail.com}{lugusto}\}@gmail.com
\vspace{-5mm}
}
\date{}
\begin{document}
\twocolumn[
\begin{@twocolumnfalse}
\maketitle
\begin{abstract}
Internet users play an increasingly important role in web content creation. There are initiatives and solutions for the digital preservation of the web, including the well-known Internet Archive, but they are inefficient at archiving user-generated content in social networks and wikis. In this article we explore the problems of wiki preservation and the lack of tools to accomplish this task successfully, and we present and assess WikiTeam, the solution that we have built. WikiTeam is a collaborative effort to develop and run software for the digital preservation of wikis. As of January 2014, we have extracted the texts, histories, images and metadata of more than 4,500 stand-alone wikis, several wikifarms and 24~TB of Wikimedia Commons files. The preserved content represents a huge collection of datasets of the wikisphere, with incalculable historical and research value.
\\
\\
\textbf{Keywords:} web digital preservation, social web archiving, archiving applications and systems
\end{abstract}
\end{@twocolumnfalse}
]
\section{Introduction}
This is a general introduction to web archiving.
Ideas (write yours):
\begin{itemize}
\item brief description and importance of digital preservation, Internet Archive and Wayback
\item user-generated content explosion, social networks, wikis, Archive Team, LoC archiving Twitter
\item Wikipedia (2001); people start to use MediaWiki for their own wikis; wikifarms
\item wikis are not only text and files, but also interesting metadata and histories. IA preserves text/files (though not all of them, and not always), but it is inefficient at saving histories and metadata
\item as most wikis are freely licensed, there are no issues with preserving this content
\end{itemize}
\section{Digital preservation of wikis}
This section deals specifically with wikis (an area within web archiving).
Ideas (write yours):
\begin{itemize}
\item lack of public dumps/mirrors
\item lost wikifarms (ScribbleWiki)
\item existing software: Wikitravel scripts, oxygenpump %http://wikitravel.org/en/Wikitravel_talk:Database_dump %http://code.google.com/p/oxygenpump/
\item proposals: Urobe
\item other: manually export with Special:Export, or ad-hoc scripts
\end{itemize}
\section{WikiTeam: digital preservation of the wikisphere}
Here we are.
Ideas (write yours):
\begin{itemize}
\item presentation of WikiTeam
\item achievements (statistics, wikifarms, commons)
\item single backups and backups in batches (launcher)
\item uploaded to IA (long-term preservation, BitTorrent webseed)
\item how we generate lists of wikis and reuse others (Pavlo list)
\item WikiApiary partnership
\end{itemize}
\section{Conclusions and future work}
Ideas (write yours):
\begin{itemize}
\item summarizing the best
\item current issues and possible solutions
\end{itemize}
\bibliographystyle{wink}
\bibliography{wikiteam-2014}
\section*{Acknowledgements}
\section*{License}
This work is licensed under a \href{http://creativecommons.org/licenses/by-sa/3.0/}{Creative Commons Attribution-ShareAlike 3.0 Unported} license.
\end{document}

@ -0,0 +1,17 @@
@inproceedings{okoli2009,
author = "Chitu Okoli",
doi = "10.1109/ICDS.2009.28",
booktitle = "Third International Conference on Digital Society",
title = "A Brief Review of Studies of Wikipedia in Peer-Reviewed Journals",
url = "http://chitu.okoli.org/images/stories/bios/pro/research/open/Okoli2009ICDS.pdf",
year = "2009",
}
Niu, J. (2012). An Overview of Web Archiving. D-Lib Magazine, Volume 18, Issue 3/4 <http://www.dlib.org/dlib/march12/niu/03niu1.html>
Popitsch, N., Mosser, R. and Philipp, W. (2010). Urobe: a prototype for wiki preservation.
Rodríguez, E. J. et al. (2011). WikiTeam - We archive wikis, from Wikipedia to tiniest wikis. https://code.google.com/p/wikiteam
Stuckman, J. and Purtilo, J. (2009). Measuring the Wikisphere. WikiSym 09.
A study of 151 Internet wikis that use MediaWiki. The communities are analyzed with WikiCrawler, a tool that downloads their data. It finds that the studied wikis share a pattern in which users' contribution levels are highly unequal, with a small number of users contributing a large share of the work.
