Page 1 of 2
					
				Excel remove duplicates
				Posted: Fri Feb 23, 2024 3:05 am
				by Victorio
				Hi 
what is the easiest way to remove duplicate rows - rows that have the same selected columns, i.e. not all for the created Excel file?
I can also do it through the database using the UNIQUE index, but it's more complicated.
I'm trying what I found on the internet for VBA, but I don't know the syntax for Xbase++, it gives me an error
Code: Select all
oExcel:=CreateObject("Excel.Application")
IF Empty( oExcel )
	ladenie( "Microsoft Excel nie je nainštalovaný, Active X sa nedá vytvoriť !" )
ENDIF
oExcel:Visible := .f.
cExcelFile:="d:\PROGSA\RAUKNW\TLAC\MYLE3\POKUS.XLS"
IF !FExists(cExcelFile)
	DC_WinAlert( 'File does not exist:' + Chr(13) + cExcelFile )
ENDIF
oBook:=oExcel:workbooks:open( cExcelFile )	
aValues := oBook:workSheets(1):usedRange:value	// only for test if can read info from xls to array
ActiveSheet:UsedRange:RemoveDuplicates:Columns:=Array(1,2,3,4,5):Header:=xlYes
* I do not know if need set this :
*oSheet := oBook:Worksheets(1) 
*oSheet:Rng:RemoveDuplicates:Columns:=Array(1,2,3,4,5):Header:=xlYes
* or also this 
*Rng:=Range("A1:D1")
*oSheet:Rng:RemoveDuplicates:Columns:=Array(1,2,3,4,5):Header:=xlYes
*and what else ...
oBook:close()
oBook:destroy()
// Quit Excel
oExcel:Quit()
oExcel:Destroy()
wtf aValues
 
			
					
				Re: Excel remove duplicates
				Posted: Sun Feb 25, 2024 4:27 am
				by SlavkoDam
				Hi,
As you know, you cannot use syntax you tried from Internet, because its VB syntax and not Xbase++ syntax. For Xbase++ syntax you have to read Xbase++ AutomationObject() class documentation. I send you attached MS VBA documentation for Excel 2010, and a program to install to open that file type.
For example, if your defined range is oRange and your unique column numbers in the range are 1 and 2, you can try this:
oRange:RemoveDuplicates( {1, 2}, xlGuess )  or
oRange:callMethod( "RemoveDuplicates", {1, 2}, xlGuess )
			 
			
					
				Re: Excel remove duplicates
				Posted: Mon Feb 26, 2024 1:46 am
				by Victorio
				Hi, thanks for info, 
I tryed several versions of syntax but when I call RemoveDuplicates, get Error Automation/6500. Description : Invalid index, 0Error in array index
Operation ? RemoveDuplicates
for example : 
oUsedRange := oSheet:usedrange
oUsedRange:RemoveDuplicates( {1, 2}, xlGuess )
oUsedRange:RemoveDuplicates( {0, 1, 2}, xlGuess )
still get error, 
when try this :
oUsedRange:RemoveDuplicates( 1, xlGuess )
works , 
Here is something about array, do not know RemoveDuplicates wont accept array of columns...
this read sheet do array without problems
aValues := oBook:workSheets(1):usedRange:value
			 
			
					
				Re: Excel remove duplicates
				Posted: Mon Feb 26, 2024 2:39 am
				by Victorio
				everything looks to RemoveDuplicates(param1,param2)  not accept param1 as Array, but only number of one column.
I do not understand why, or how syntax need use to process
			 
			
					
				Re: Excel remove duplicates
				Posted: Mon Feb 26, 2024 5:02 pm
				by Auge_Ohr
				hi,
have you try
Code: Select all
ActiveSheet.Range("A1:C100").RemoveDuplicates
The following code sample removes duplicates with the first 2 columns
Code: Select all
ActiveSheet.Range("A1:C100").RemoveDuplicates Columns:=Array(1,2), Header:=xlYes
 
			
					
				Re: Excel remove duplicates
				Posted: Tue Feb 27, 2024 3:03 am
				by Victorio
				Hi Jimmy,
This syntax is not correct for Xbase++
ActiveSheet.Range("A1:C100").RemoveDuplicates Columns:=Array(1,2), Header:=xlYes
I tryed ActiveSheet.Range("A1:C100"):RemoveDuplicates(Columns:=Array(1,2), Header:=xlYes)
but Xbase return Error Automation/6500
The parameter is incorrect.
0
Operation : RemoveDuplicates
Operating system error : -2147024809
this is some problem with index.
where sign "." is replaced to ":", but also there is problem with Array() fnc.  if I understand good, in examples on net Array() is used as function to create array from several variables, Xbase++ function Array() only create blank array with blank elements , Array(1,2) create two dimensional array without elements, and not column 1 and 2.
Or am I wrong  ?
At this point, I gave up and started working on an edit where I filter out duplicate records using a DBF auxiliary database re-indexed with a UNIQUE index key. Although I don't know if the index will take 37 fields with a total key length of about 854 characters. ?
			 
			
					
				Re: Excel remove duplicates
				Posted: Tue Feb 27, 2024 3:59 am
				by SlavkoDam
				Hi,
Its obvious that Xbase++ AutomationObject don't convert Xbase++ array to VB array in the right way.
You can use the following manual algorithm for deleting duplicate rows on the first 2 columns, without RemoveDuplicates method. It will do the job as well.
Code: Select all
FOR I = 1 TO oRange:Rows:Count
  IF I = oRange:Rows:Count
    EXIT
  ELSEIF oRange:Cells(I, 1):Value = oRange:Cells(I + 1, 1):Value .AND. oRange:Cells(I, 2):Value = oRange:Cells(I + 1, 2):Value
    oRange:Cells(I, 1):EntireRow:Delete
  ENDIF
NEXT
 
			
					
				Re: Excel remove duplicates
				Posted: Tue Feb 27, 2024 4:07 am
				by Victorio
				SlavkoDam : 
thank you very much, I will try it.
Viktor
			 
			
					
				Re: Excel remove duplicates
				Posted: Tue Feb 27, 2024 4:22 am
				by Victorio
				looks this not work as I want. duplicates can be in different places of the table, not only the following records.
then if this can work, must first sort records.
also work your example with "." and not only with ":"  ?
I modify this as here :
FOR I = 1 TO oRange:Rows:Count
IF I = oRange:Rows:Count
EXIT
ELSEIF oRange:Cells(I, 1):Value = oRange:Cells(I + 1, 1):Value .AND. oRange:Cells(I, 2):Value = oRange:Cells(I + 1, 2):Value
oRange:Cells(I, 1):EntireRow:Delete
ENDIF
NEXT
			 
			
					
				Re: Excel remove duplicates
				Posted: Tue Feb 27, 2024 4:49 am
				by SlavkoDam
				Hi,
Yes, "." have to be replaced with ":", its Xbase++ syntax.
Yes, duplicates can be in different places of the table, not only the following records. So, first you have to sort range with Sort method. Here is an example for the first 2 columns.
ActiveSheet:Range("A1:C1000"):Sort(ActiveSheet:Range("A1"),  , ActiveSheet:Range("B1"))
You can also try this modified code which scans all the following rows for duplicate rows and deletes them.
Code: Select all
FOR I = 1 TO oRange:Rows:Count
  IF I = oRange:Rows:Count
    EXIT
  ENDIF
  FOR J = I + 1 TO oRange:Rows:Count
    IF oRange:Cells(I, 1):Value = oRange:Cells(J, 1):Value .AND. oRange:Cells(I, 2):Value = oRange:Cells(J, 2):Value
      oRange:Cells(J, 1):EntireRow:Delete
    ENDIF
  NEXT
NEXT
Slavko