Questions tagged as 'big-data'

4
answers

Strategies for analyzing very large databases in R (that do not fit in RAM)

Suppose I have a huge database that does not fit into RAM. What strategies can I use to analyze it in R, given that I cannot load it fully into memory? PS: The question is not just about how to make R talk to a relational / non-relational databa...
asked on 28.08.2014 / 06:44
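
One common out-of-core strategy is to stream the data and keep only running aggregates in memory. The sketch below is an illustration in Python rather than R, and it is not taken from the answers to this question; the function name `streaming_mean`, the column names, and the sample data are all hypothetical.

```python
import csv
import io


def streaming_mean(lines, column):
    """Compute the mean of one numeric column, reading one row at a time."""
    reader = csv.DictReader(lines)
    count = 0
    total = 0.0
    for row in reader:
        # Only the running sum and count are kept, never the whole table.
        total += float(row[column])
        count += 1
    return total / count if count else float("nan")


# Hypothetical in-memory sample standing in for a file too large to load.
sample = io.StringIO("id,value\n1,10.0\n2,20.0\n3,60.0\n")
mean_value = streaming_mean(sample, "value")
print(mean_value)
```

With a real file, an open file handle would be passed in place of `sample`, so memory use stays constant regardless of file size.
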
3
answers

Pre-process large text files in R

I am writing a script, which I will make public, to open the RAIS microdata (de-identified, available here) in R using MonetDB. However, the database does not accept a comma (,) as a decimal separator. Each RAF UFano.txt file is quite large (up...
asked on 09.10.2014 / 22:52
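
A minimal sketch of this kind of pre-processing, written in Python for illustration and not taken from the answers: it rewrites decimal commas as dots one line at a time, so the file never has to fit in memory. It assumes fields are separated by something other than a comma (a semicolon here), so a comma between two digits can only be a decimal separator. The function name and the sample rows are hypothetical.

```python
import io
import re

# A comma with a digit on both sides is treated as a decimal separator.
# Assumption: the field delimiter is NOT a comma (a semicolon in this sample).
DECIMAL_COMMA = re.compile(r"(?<=\d),(?=\d)")


def convert_decimal_commas(src, dst):
    """Copy src to dst line by line, replacing decimal commas with dots."""
    for line in src:
        dst.write(DECIMAL_COMMA.sub(".", line))


# Hypothetical sample standing in for one large input file.
src = io.StringIO("nome;salario\nA;1234,56\nB;789,00\n")
dst = io.StringIO()
convert_decimal_commas(src, dst)
converted = dst.getvalue()
print(converted)
```

For real files, `src` and `dst` would be file handles opened with the correct encoding. If the data used commas as field delimiters as well, this pattern would corrupt it and a proper CSV parser would be needed instead.
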
1
answer

Is there a way to load a SQL table directly into a data.table without going through the SQL -> data.frame -> data.table path?

I want to load a SQL table directly into a data.table. When I query with dbGetQuery, what I get is a data.frame. I know I can easily convert that data.frame into a data.table afterwards. But I'd like to skip this step - which on some occasions may...
asked on 19.09.2014 / 13:27
1
answer

Log Storage / Indexing Tools

I need to build a system to store logs of various user actions (the system will generate millions of records per week). What tools are available for this kind of need? PS: It is essential to provide an API to search the logs, a...
asked on 20.09.2017 / 15:41
2
answers

Is Hadoop a database? What is Hadoop?

After all, what is Hadoop? Is it a database? I have often heard people say "that company uses the Hadoop database", but when I started to study Big Data I saw that this was not really the case. So if it's not a database, what is it?
asked on 25.08.2017 / 21:33
1
answer

MySQL table with lots of data

Example case: I have a system with approximately 5 million records. Is it recommended to store them in a single MySQL table, for example tabela_empresas_brasil, given that every time the user makes a query it will need to search those 5 million r...
asked on 14.09.2016 / 17:21
1
answer

Insert data into BigQuery daily

I have a routine that returns 2900 rows daily. This information is sent to BigQuery. Is there a way to send these 2900 rows to a BigQuery table today and then append another 2900 rows to that same table?
asked on 29.03.2017 / 21:34
1
answer

Selecting different intervals in a giant dataframe in RStudio

I have a large CSV with stock dates and their closing prices, too large to open in Excel. The stock name is in the same column as the date and only appears at the beginning of the series, as shown below: I have limited knowledg...
asked on 07.12.2018 / 16:17
1
answer

NiFi consuming all available disk space

I have some processes that query a SQL database and push the results to a queue. On the queue I've limited the size to 100 MB and the count to 10, but even so NiFi ignores the limits and allocates everything at once, generating a queue of more than 80 GB. I'm...
asked on 06.08.2018 / 16:59
1
answer

How to crawl data like Google?

Good evening. How can I build an application that basically works like Google? I want to type a keyword and receive results for it, or pull data from Google itself. Which language and approach should I use?
asked on 28.04.2018 / 00:37