Convert rows to columns (Pivot?)

9

I have a table with 33 million phone records with the structure below:

ccpf_cnpj   ddd telefone    tipo    origem
11111111111 83  81021111    M   SERASA
11111111111 83  87472222    M   SERASA
11111111111 83  81023333    M   TRANSUNION
11111111111 83  88724444    M   TRANSUNION
11111111111 83  87475555    M   TRANSUNION
22222222222 43  36271111    F   SERASA
22222222222 44  36272222    F   SERASA
22222222222 43  36273333    F   TRANSUNION
22222222222 43  36284444    F   TRANSUNION
33333333333 51  51811111    F   SERASA
33333333333 51  56212222    F   SERASA
33333333333 51  96213333    M   SERASA

I need to "denormalize" this table so that each CPF has a single record, it should look like this:

CCPF_CNPJ   DDD_1   TELEFONE_1  TIPO_1  ORIGEM_1    DDD_2   TELEFONE_2  TIPO_2  ORIGEM_2    DDD_3   TELEFONE_3  TIPO_3  ORIGEM_3    DDD_4   TELEFONE_4  TIPO_4  ORIGEM_4    DDD_5   TELEFONE_5  TIPO_5  ORIGEM_5
11111111111 83  81021111    M   SERASA  83  87472222    M   SERASA  83  81023333    M   TRANSUNION  83  88724444    M   TRANSUNION  83  87475555    M   TRANSUNION
22222222222 43  36271111    F   SERASA  44  36272222    F   SERASA  43  36273333    F   TRANSUNION  43  36274444    F   TRANSUNION  NULL    NULL    NULL    NULL
33333333333 51  51811111    F   SERASA  51  56212222    F   SERASA  51  96213333    M   SERASA  NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL

A colleague here suggested using a Pivot + Unpivot, but I could not adapt to the need, or maybe I do not know how to use the Pivot for this case, usually when I use a field that defines the columns, which is not the case .

I was able to do a sampling (100 records) using joins with the table itself, but the solution became unfeasible due to the volume of data (remembering: 33 million records).

Has anyone ever had anything similar?

The SQL Server version here is 2014.

Thank you in advance.

    
asked by anonymous 11.02.2016 / 13:05

3 answers

2

I've adapted this answer to your need. It uses not only PIVOT , but also dynamic query construction because the number of columns will be undefined:

DECLARE 
  @MaxCount INT, 
  @coluna_ddd CHAR(3) = 'ddd', @ddd NVARCHAR(MAX), 
  @coluna_telefone CHAR(8) = 'telefone', @telefone NVARCHAR(MAX), 
  @coluna_tipo CHAR(4) = 'tipo', @tipo NVARCHAR(MAX), 
  @coluna_origem CHAR(6) = 'origem', @origem NVARCHAR(MAX), 
  @colunas_dinamicas VARCHAR(MAX), @sentenca_pivot VARCHAR(MAX),
  @i INT, @j NVARCHAR(10)

SELECT @MaxCount = MAX(r), @i = 1 FROM (SELECT CCPF_CNPJ, row_number() OVER (PARTITION BY CCPF_CNPJORDER BY CCPF_CNPJ) r FROM Tabela) T
WHILE @i <= @MaxCount
    SELECT @j = cast(@i AS NVARCHAR(10)), 
           @ddd = COALESCE(@ddd + ', ', '') + '['+@coluna_ddd + '_' + @j + ']', 
           @telefone = COALESCE(@telefone + ', ', '') + '[' + @coluna_telefone + '_' + @j + ']', 
           @tipo = COALESCE(@tipo + ', ', '') + '[' + @tipo + '_' + @j + ']', 
           @origem = COALESCE(@origem + ', ', '') + '[' + @origem + '_' + @j + ']', 
           @colunas_dinamicas = COALESCE(@colunas_dinamicas + ', ', '') + '[' + @coluna_ddd + '_' + @j + '], [' + @coluna_telefone + '_' + @j + '], [' + @coluna_tipo + '_' + @j + '], [' + @coluna_origem + '_' + @j + ']', 
           @i = @i + 1

SELECT 
    @sentenca_pivot = 'SELECT * FROM (SELECT CCPF_CNPJ, ''@2_'' + CAST(row_number() OVER (PARTITION BY CCPF_CNPJ ORDER BY CCPF_CNPJ) AS VARCHAR) A, @2 FROM Tabela) T PIVOT (MAX(@2) FOR A IN (@1)) PVT',
    @colunas_dinamicas = 'SELECT A1.CCPF_CNPJ, ' + @colunas_dinamicas +' FROM (' + REPLACE(REPLACE(@sentenca_pivot, '@1', @ddd), '@2', @coluna_ddd)+') A1
    LEFT OUTER JOIN ('+REPLACE(REPLACE(@sentenca_pivot, '@1', @telefone), '@2', @coluna_telefone)+') A2 ON A1.CCPF_CNPJ = A2.CCPF_CNPJ
    LEFT OUTER JOIN ('+REPLACE(REPLACE(@sentenca_pivot, '@1', @tipo), '@2', @coluna_tipo)+') A3 ON A1.CCPF_CNPJ = A3.CCPF_CNPJ
    LEFT OUTER JOIN ('+REPLACE(REPLACE(@sentenca_pivot, '@1', @origem), '@2', @coluna_origem)+') A4 ON A1.CCPF_CNPJ = A4.CCPF_CNPJ'

EXEC(@colunas_dinamicas)
    
11.02.2016 / 15:21
1
SELECT Campo1,Exemplo1,Exemplo2,Exemplo3
FROM
(
select Campo1,Campo2,'Exemplo' +convert(varchar(10),ROW_NUMBER () over (partition by Campo1 Order By Campo2)) as Sequencia from Tabela1

)Apelido1
PIVOT
(
MAX(Campo2)
for Sequencia in (Exemplo1,Exemplo2,Exemplo3)
)Apelido2
    
16.02.2016 / 19:54
0

Look, this works here. In this case I am assuming that the CPF that has the largest telephone number will be left, if you have another column that makes more sense, just change:

FILTRO = ROW_NUMBER() OVER(PARTITION BY CCPF_CNPJ order by **TELEFONE** desc),  

by:

FILTRO = ROW_NUMBER() OVER(PARTITION BY CCPF_CNPJ order by **ID** desc),  

or

FILTRO = ROW_NUMBER() OVER(PARTITION BY CCPF_CNPJ order by **DATA** desc),  

etc ...

WITH DADOS AS(  
 SELECT   
     FILTRO = ROW_NUMBER() OVER(PARTITION BY CCPF_CNPJ order by TELEFONE desc),   
     CCPF_CNPJ,   
     DDD,   
     TELEFONE,   
     TIPO,   
     ORIGEM  
 FROM TABELA
)  
SELECT * FROM DADOS
WHERE FILTRO = 1  
    
22.06.2016 / 19:05