How to efficiently compare two lists of Products

Question

I'm trying to write a method that efficiently checks whether two lists of products are equal.

List<Product> firstList = getProductsListFromSomewhere();
List<Product> secondList = getProductsListFromSomewhereElse();

public boolean areListsEqual(List<Product> firstList, List<Product> secondList) {
    ...
}

Constraints and conditions

  • The same product can appear multiple times in a list, e.g. (Product A, Product B, Product A, Product C). This is a problem if I store the first list in a HashSet and then walk the second list to check whether each Product is in the set, because a HashSet cannot hold duplicates.

  • Two lists are considered equal if they contain the same products, each appearing the same number of times; their order is NOT relevant. For example, these two lists are considered equal:
    (Product A, Product B, Product A, Product C)
    (Product C, Product A, Product A, Product B)
    But these two are considered different:
    (Product A, Product B, Product A, Product C)
    (Product A, Product B, Product C)

  • The Product object is defined as follows (note that its code is autogenerated, so I can't write the equals and hashCode methods inside its class):

     class Product {
         private String name;
         private Integer quantity;
         private List<Discount> discountsList;
    
         //some other field not needed for the comparison
     }
    
  • Two products are considered equal if they have the same name, the same quantity and the same discountsList.

  • When comparing the lists of discounts, the order of the elements is NOT relevant either.

  • Discount is defined like this (this class is also autogenerated, so I can't write the equals and hashCode methods for it):

     class Discount {
         String code;
    
         //some other field not needed for the comparison
     }
    
  • Two discounts are considered equal if they have the same code.

Requirements and preferences

  • The comparison has to be efficient (I guess I have to use some kind of hashing).

  • The code should be as clean as possible (I would prefer to avoid things like reflection to parse the structure).

  • If possible, I would prefer NOT to use external libraries.

My (not valid) approach

I started drafting a possible solution, but I ran into several blockers and I don't know whether I should refine it or rethink it completely. My idea is to extend the Product class inside the class that performs the comparison:

List<Product> firstList = getProductsListFromSomewhere();
List<Product> secondList = getProductsListFromSomewhereElse();

public boolean areListsEqual(List<Product> firstList, List<Product> secondList) {
    ...
}  
  
private class ComparableProduct extends Product {

  @Override
  public boolean equals(Object obj) {
    if (this == obj) {
      return true;
    }
    if (obj == null) {
      return false;
    }
    if (getClass() != obj.getClass()) {
      return false;
    }
    final ComparableProduct other = (ComparableProduct)obj;
    if (!Objects.equals(this.name, other.name)) {
      return false;
    }
    if (!Objects.equals(this.quantity, other.quantity)) {
      return false;
    }
    if (!Objects.equals(this.discountsList, other.discountsList)) {
      return false;
    }
    return true;
  }

  @Override
  public int hashCode() {
    int hash = 3;
    hash = 79 * hash + Objects.hashCode(this.name);
    hash = 79 * hash + Objects.hashCode(this.quantity);
    hash = 79 * hash + Objects.hashCode(this.discountsList);
    return hash;
  }
}

This approach obviously doesn't work, because the Discount objects are still compared without equals and hashCode being defined. I can't extend Discount either: the discountsList field in Product is of type List<Discount>, so I couldn't use a ComparableDiscount even if I created one. Furthermore, once a hashing mechanism is defined, I don't know what the best approach or data structure is for checking that the two lists are equal.

Would you help me complete this part of the code in the best possible way?

Answer 1

Score: 0

The easiest approach is to write a function that takes a Product and generates a unique string representation of it. Make sure that any two products you consider the same produce the same string (for example, by sorting the discount codes first).
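A minimal sketch of such a key function follows. The Product and Discount classes here are stand-ins for the autogenerated ones, and their constructors and getters (getName, getQuantity, getDiscountsList, getCode) are assumptions, since the question only shows the fields. ProductKeys and keyOf are hypothetical names. The sketch also assumes a null discount list should count the same as an empty one.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Stand-ins for the autogenerated classes. The constructors and getters are
// assumptions; the question only shows the fields.
class Discount {
    private final String code;
    Discount(String code) { this.code = code; }
    String getCode() { return code; }
}

class Product {
    private final String name;
    private final Integer quantity;
    private final List<Discount> discountsList;

    Product(String name, Integer quantity, List<Discount> discountsList) {
        this.name = name;
        this.quantity = quantity;
        this.discountsList = discountsList;
    }
    String getName() { return name; }
    Integer getQuantity() { return quantity; }
    List<Discount> getDiscountsList() { return discountsList; }
}

// Hypothetical helper: builds a canonical string key for a Product.
final class ProductKeys {
    private ProductKeys() {}

    // Products that are equal under the question's rules produce equal keys.
    static String keyOf(Product product) {
        // Assumption: a null discount list is treated like an empty one.
        List<Discount> discounts = product.getDiscountsList() == null
                ? Collections.<Discount>emptyList()
                : product.getDiscountsList();
        // Sorting the codes makes the key independent of discount order.
        String codes = discounts.stream()
                .map(Discount::getCode)
                .sorted(Comparator.nullsFirst(Comparator.<String>naturalOrder()))
                .map(ProductKeys::lengthPrefixed)
                .collect(Collectors.joining());
        return lengthPrefixed(product.getName())
                + lengthPrefixed(String.valueOf(product.getQuantity()))
                + codes;
    }

    // A length prefix keeps fields from running into each other, so different
    // products cannot accidentally produce the same key.
    private static String lengthPrefixed(String value) {
        return value == null ? "-1:" : value.length() + ":" + value;
    }
}
```

The length prefixes are a design choice: a plain separator such as a comma would be ambiguous if a name or code could contain that separator.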

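Now you can turn a List of Product objects into a List of strings. Two such lists are fairly easy to compare, for example by sorting both and comparing them element by element, or by counting how many times each string occurs.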
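One way to compare two lists of strings while ignoring order but respecting duplicates is to count occurrences in a HashMap. The sketch below assumes the strings have already been generated; KeyListComparison and sameMultiset are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: order-insensitive, duplicate-aware list comparison.
final class KeyListComparison {
    private KeyListComparison() {}

    // Returns true when both lists hold the same strings the same number of times.
    static boolean sameMultiset(List<String> first, List<String> second) {
        if (first.size() != second.size()) {
            return false; // different lengths can never match
        }
        // Count how often each key occurs in the first list.
        Map<String, Integer> counts = new HashMap<>();
        for (String key : first) {
            counts.merge(key, 1, Integer::sum);
        }
        // Consume one count for every key found in the second list.
        for (String key : second) {
            Integer remaining = counts.get(key);
            if (remaining == null) {
                return false; // key missing, or it occurs more often in second
            }
            if (remaining == 1) {
                counts.remove(key);
            } else {
                counts.put(key, remaining - 1);
            }
        }
        return counts.isEmpty();
    }
}
```

With hashing, this runs in time roughly proportional to the combined length of the two lists, whereas sorting both lists first would add a logarithmic factor.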

If the descriptions can be large, one tip is to work with an MD5 hash of each description rather than the description itself. The hashes are shorter, and the odds of an accidental collision are astronomically low.
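A short sketch of hashing a description with the JDK's built-in MessageDigest, so no external library is needed. KeyDigests and md5Hex are hypothetical names, and encoding the description as UTF-8 is an assumption.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: shortens a description to a fixed-size MD5 digest.
final class KeyDigests {
    private KeyDigests() {}

    // Returns the MD5 digest of the description as 32 lowercase hex characters.
    static String md5Hex(String description) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(description.getBytes(StandardCharsets.UTF_8));
            // Zero-pad to 32 digits so leading zero bytes are not dropped.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // Every standard Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 is not available on this JVM", e);
        }
    }
}
```

MD5 is fine against accidental collisions, but it is not collision-resistant against deliberately crafted input. If the product data is untrusted, passing "SHA-256" to MessageDigest.getInstance (and widening the padding to 64 digits) is a straightforward swap.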

If you want to actually identify the differences, you should store a map from the string representing each product to the product object. That way, once you know which strings are in one list but not the other, you can turn the strings back into objects before returning them.
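The idea can be sketched as a generic helper that groups the items of the first list by their string key and then cancels out one item for every matching key in the second list. ListDifferences and onlyInFirst are hypothetical names, and the keyOf parameter stands for whatever string function you wrote for Product. Because a list may contain duplicates, each key maps to a queue of objects rather than a single object, which is a small extension of the map described above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: finds the objects that make two lists differ.
final class ListDifferences {
    private ListDifferences() {}

    // Returns the items of first that have no counterpart (by key) in second.
    // Assumes the lists contain no null items.
    static <T> List<T> onlyInFirst(List<T> first, List<T> second,
                                   Function<? super T, String> keyOf) {
        // key -> items from first that are not yet matched
        Map<String, Deque<T>> unmatchedByKey = new LinkedHashMap<>();
        for (T item : first) {
            unmatchedByKey.computeIfAbsent(keyOf.apply(item), k -> new ArrayDeque<>()).add(item);
        }
        // Each item in second cancels at most one item with the same key.
        for (T item : second) {
            Deque<T> unmatched = unmatchedByKey.get(keyOf.apply(item));
            if (unmatched != null) {
                unmatched.poll();
            }
        }
        // Whatever is left could not be matched.
        List<T> result = new ArrayList<>();
        for (Deque<T> leftovers : unmatchedByKey.values()) {
            result.addAll(leftovers);
        }
        return result;
    }
}
```

Calling it once in each direction, onlyInFirst(firstList, secondList, keyOf) and onlyInFirst(secondList, firstList, keyOf), gives both sides of the difference. The two lists are equal exactly when both results are empty.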

huangapple
  • Posted on April 11, 2020 at 00:23:30
  • When reposting, please keep the link to this article: https://java.coder-hub.com/61144419.html