02347nas a2200337 4500000000100000008004100001260005900042653002000101653002200121653003500143653002300178653001600201653001800217653002000235653002000255653002600275653002600301653002100327653001700348653002600365653003100391653001700422653001800439100001900457700001600476245010500492856015600597300001200753520121900765020002501984 d bInstitute of Electrical and Electronics Engineers Inc.10aArchiving datum10acultural heritage10aCultural heritage preservation10aCultural heritages10aDaily lives10aData cleaning10aData collecting10aDigital storage10aHeritage preservation10aHistoric preservation10aQuery processing10aSocial media10aSocial media networks10aSocial networking (online)10aWeb Scraping10aWeb scrapings1 aShaimaa Rashid1 aRawaa Qasha00aExtracting and Archiving Data from Social Media to Support Cultural Heritage Preservation in Nineveh uhttps://www.scopus.com/inward/record.uri?eid=2-s2.0-85129952423&doi=10.1109%2fCSASE51777.2022.9759782&partnerID=40&md5=878387f6d3a7d39f809819c9997e14ea a295-3003 aDuring the last decades, various aspects of Nineveh s cultural heritage have been destroyed during wars or natural causes. Therefore, the needs to preserve these valuable heritages become crucial. With the increased use of the Internet, Social media networks have become part of peoples daily lives for publicly sharing information, including their feelings, opinion expression, knowledge, and sharing images, videos, audio, and even their locations. This paper aims to gather Nineveh s cultural heritage data from different social media sites. We prepare it to be used for supporting the preservation of the cultural heritage process, including both tangible and intangible heritage. With social media data, python programming language, and web scraping, various data types can be fetched from different heterogeneous sources such as Twitter, YouTube, etc., depending on several keywords and hashtags. Once the data is collected, several pre-processing operations are implemented to clean, organize and archive the resulted data in the NoSQL database and Amazon Simple Storage Service (Amazon S3). The archived cleaned information can be used later to query, browse, analyze and visualize the target information. a9781665426329 (ISBN)