This project has moved. For the latest updates, please go here.

How to remove duplicates from a Worksheet

Sep 18, 2012 at 11:51 PM

Hello,

A search returned nothing about this. Could someone please tell me how to remove duplicate rows in a worksheet?

Thanks,

Enzo

Coordinator
Sep 19, 2012 at 3:45 AM

Don't wait for it but I'll create a work item to include the following methods.

range.UniqueRows()

range.UniqueColumns()

column.UniqueCells()

row.UniqueCells()

range.DeleteDuplicateRows()

range.DeleteDuplicateColumns()

row.DeleteDuplicateCells()

column.DeleteDuplicateCells()

In the meantime you can just create a hash table and remove the cells/rows that are already in the hash.

Coordinator
Sep 19, 2012 at 3:46 AM
This discussion has been copied to a work item. Click here to go to the work item and continue the discussion.
Coordinator
Sep 21, 2012 at 5:05 PM
    class Program
    {
        static void Main()
        {
            var wb = new XLWorkbook();
            var ws = wb.AddWorksheet("Sheet1");
            ws.Cell("A1").SetValue("Name")
                .CellBelow().SetValue("John")
                .CellBelow().SetValue("Bob")
                .CellBelow().SetValue("John");
 
            ws.Cell("B1").SetValue("Age")
            .CellBelow().SetValue(20)
            .CellBelow().SetValue(30)
            .CellBelow().SetValue(20);
 
            var uniques = new HashSet<IXLRangeRow>();
            var comparer = new XLRangeRowComparer();
            foreach (var row in ws.RangeUsed().Rows())
            {
                if (uniques.Contains(row, comparer))
                    row.Delete();
                else
                    uniques.Add(row);
            }
 
            wb.SaveAs(@"C:\MyFiles\Excel Files\Sandbox.xlsx");
        }
    }
 
    class XLRangeRowComparerIEqualityComparer<IXLRangeRow>
    {
        public bool Equals(IXLRangeRow x, IXLRangeRow y)
        {
            var xCount = x.CellCount();
            var yCount = y.CellCount();
            if (xCount != yCount) return false;
 
            // Replace this comparison with one that suits your needs
            for (var cellNo = 1; cellNo <= xCount; cellNo++)
                if (x.Cell(cellNo).GetString() != y.Cell(cellNo).GetString()) return false;
 
            return true;
        }
 
        public int GetHashCode(IXLRangeRow obj)
        {
            return obj.GetHashCode(); // Not used in this example
        }
    }