
java cosine similarity problem



By : user3848302
Date : October 26 2020, 11:51 AM
I'm not sure about your implementation, but the cosine similarity of two vectors is equal to their normalized dot product (the cosine distance is 1 minus that value).
The dot product of two vectors can be written as a · b = aᵀb. As a result, if the vectors have different lengths, the dot product is undefined and you cannot compute the cosine.
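
As a rough illustration of the point above, here is a minimal sketch of cosine similarity as a normalized dot product, with an explicit check that both vectors have the same length. The class and method names are made up for this example and are not taken from the question.

```java
final class CosineSimilarityDemo {

    /** Returns dot(a, b) / (|a| * |b|); rejects vectors of different length. */
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "Vectors must have the same length: " + a.length + " vs " + b.length);
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            // The cosine is undefined when either vector has zero length.
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] x = {1.0, 0.0};
        double[] y = {0.0, 1.0};
        System.out.println(cosineSimilarity(x, x)); // prints 1.0 (same direction)
        System.out.println(cosineSimilarity(x, y)); // prints 0.0 (orthogonal)
    }
}
```

Throwing on a length mismatch is one possible design choice; padding the shorter vector with zeros would be another, but only if the missing components really mean "zero" in your data.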


Cosine Similarity in Java



By : user3255226
Date : March 29 2020, 07:55 AM
You want to compute the similarity between the given row and each row in the Matrix, so the inner product and both norms must be computed getRowDimension() times.
The initializations are in the wrong place, though: move them into the loop over all rows.
code :
private ArrayList<Double> cosineSimilarity(int rowIndex, Matrix D) {
    ArrayList<Double> similarRows = new ArrayList<>();

    for (int row = 0; row < D.getRowDimension(); row++) {
        // Reset the accumulators for every row that is compared against rowIndex.
        double dotProduct = 0.0, firstNorm = 0.0, secondNorm = 0.0;
        for (int column = 0; column < D.getColumnDimension(); column++) {
            dotProduct += D.get(rowIndex, column) * D.get(row, column);
            firstNorm += Math.pow(D.get(rowIndex, column), 2);
            secondNorm += Math.pow(D.get(row, column), 2);
        }
        double similarity = dotProduct / (Math.sqrt(firstNorm) * Math.sqrt(secondNorm));
        similarRows.add(row, similarity);
    }
    return similarRows;
}

Choice between an adjusted cosine similarity vs regular cosine similarity



By : Greg King
Date : March 29 2020, 07:55 AM
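Why would a regular cosine similarity result in a positive number for such 'different' items? Because all ratings are positive, the raw vectors point into the same quadrant, so their cosine can never be negative. Subtracting a mean first lets dissimilar items point in opposite directions, as the comparison below shows.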
code :
from scipy import spatial
import numpy as np
a = np.array([2.0,1.0])  
b = np.array([5.0,3.0])
1 - spatial.distance.cosine(a,b)
#----------------------
# 0.99705448550158149
#----------------------
c = np.array([5.0,4.0])
1 - spatial.distance.cosine(c,b)
#----------------------
# 0.99099243041032326
#----------------------
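mean_ab = sum(sum(a,b)) / 4
# mean_ab : 3.5
# Note: sum(a, b) treats b as the start value and adds each element of a to it,
# giving [8, 6], so mean_ab is 14 / 4 = 3.5 rather than the arithmetic mean
# of the four ratings (2.75).
# adjusted vectors : [-1.5, -2.5] , [1.5, -0.5]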
1 - spatial.distance.cosine(a - mean_ab, b - mean_ab)
#----------------------
# -0.21693045781865616
#----------------------
mean_cb = sum(sum(c,b)) / 4
# mean_cb : 6.5 (same caveat as above; the arithmetic mean of the four ratings is 4.25)
# adjusted vectors : [-1.5, -2.5] , [-1.5, -3.5]
1 - spatial.distance.cosine(c - mean_cb, b - mean_cb)
#----------------------
# 0.99083016804429891
#----------------------
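
For readers following along in Java, the comparison above can be sketched roughly as follows. The class and helper names are invented for this example; the vectors and the offsets 3.5 and 6.5 are copied from the Python listing rather than recomputed.

```java
final class CenteredCosineDemo {

    /** Plain cosine similarity of two equal-length vectors. */
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns a copy of v with the same offset subtracted from every component. */
    static double[] shift(double[] v, double offset) {
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] - offset;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] a = {2.0, 1.0};
        double[] b = {5.0, 3.0};
        double[] c = {5.0, 4.0};

        // Raw ratings: both pairs look almost identical (about 0.997 and 0.991).
        System.out.println(cosine(a, b));
        System.out.println(cosine(c, b));

        // After subtracting the offsets used in the listing, a and b separate
        // (about -0.217) while c and b stay close (about 0.991).
        System.out.println(cosine(shift(a, 3.5), shift(b, 3.5)));
        System.out.println(cosine(shift(c, 6.5), shift(b, 6.5)));
    }
}
```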
Python, Cosine Similarity to Adjusted Cosine Similarity



By : SL3
Date : March 29 2020, 07:55 AM
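Here's a NumPy-based solution to your problem.
First we store the rating data in an array, then subtract each row's mean rating and compute the pairwise cosine similarity between the item columns:
code :
import numpy as np
from scipy.spatial.distance import pdist, squareform

# `data` is the ratings DataFrame from the question.
fruits = np.asarray(['Apple', 'Orange', 'Pear', 'Grape', 'Melon'])
M = np.asarray(data.loc[:, fruits])
# Mean rating of each row, subtracted from that row.
M_u = M.mean(axis=1)
item_mean_subtracted = M - M_u[:, None]
# pdist returns cosine distances between columns; 1 - distance gives similarity.
similarity_matrix = 1 - squareform(pdist(item_mean_subtracted.T, 'cosine'))
# Sort each row by similarity, drop the item itself, most similar first.
indices = np.fliplr(np.argsort(similarity_matrix, axis=1)[:,:-1])
result = np.hstack((fruits[:, None], fruits[indices]))

Sample run: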
In [49]: M
Out[49]: 
array([[ 0, 10,  0,  1,  0],
       [ 6,  0,  0,  0,  2],
       [ 1,  0, 20,  0,  1],
       [ 0,  3,  6,  0, 18],
       [ 3,  0,  2,  0,  0],
       [ 0,  2,  0,  5,  0]])

In [50]: np.set_printoptions(precision=2)

In [51]: similarity_matrix
Out[51]: 
array([[ 1.  ,  0.01, -0.41,  0.48, -0.44],
       [ 0.01,  1.  , -0.57,  0.37, -0.26],
       [-0.41, -0.57,  1.  , -0.56, -0.19],
       [ 0.48,  0.37, -0.56,  1.  , -0.51],
       [-0.44, -0.26, -0.19, -0.51,  1.  ]])

In [52]: result
Out[52]: 
array([['Apple', 'Grape', 'Orange', 'Pear', 'Melon'],
       ['Orange', 'Grape', 'Apple', 'Melon', 'Pear'],
       ['Pear', 'Melon', 'Apple', 'Grape', 'Orange'],
       ['Grape', 'Apple', 'Orange', 'Melon', 'Pear'],
       ['Melon', 'Pear', 'Orange', 'Apple', 'Grape']], 
      dtype='|S6')
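
Since the page is mostly about Java, here is a rough Java sketch of the same two steps (subtract each row's mean, then take the cosine between columns). It is only an illustration: the class name, constants, and method are made up for this example, and the ratings are copied from the sample run above.

```java
import java.util.Locale;

final class AdjustedCosineDemo {

    static final String[] FRUITS = {"Apple", "Orange", "Pear", "Grape", "Melon"};

    /** Ratings copied from the sample run: one row per rater, one column per fruit. */
    static final double[][] RATINGS = {
        {0, 10, 0, 1, 0},
        {6, 0, 0, 0, 2},
        {1, 0, 20, 0, 1},
        {0, 3, 6, 0, 18},
        {3, 0, 2, 0, 0},
        {0, 2, 0, 5, 0},
    };

    /** Returns the item-by-item similarity matrix of the row-centered ratings. */
    static double[][] adjustedCosine(double[][] ratings) {
        int rows = ratings.length;
        int items = ratings[0].length;

        // Step 1: subtract each row's mean from every entry in that row.
        double[][] centered = new double[rows][items];
        for (int r = 0; r < rows; r++) {
            double sum = 0.0;
            for (double value : ratings[r]) {
                sum += value;
            }
            double mean = sum / items;
            for (int i = 0; i < items; i++) {
                centered[r][i] = ratings[r][i] - mean;
            }
        }

        // Step 2: cosine similarity between every pair of columns.
        double[][] sim = new double[items][items];
        for (int i = 0; i < items; i++) {
            for (int j = 0; j < items; j++) {
                double dot = 0.0, normI = 0.0, normJ = 0.0;
                for (int r = 0; r < rows; r++) {
                    dot += centered[r][i] * centered[r][j];
                    normI += centered[r][i] * centered[r][i];
                    normJ += centered[r][j] * centered[r][j];
                }
                sim[i][j] = dot / (Math.sqrt(normI) * Math.sqrt(normJ));
            }
        }
        return sim;
    }

    public static void main(String[] args) {
        double[][] sim = adjustedCosine(RATINGS);
        // Print one line per fruit, similarities rounded to two decimals.
        for (int i = 0; i < sim.length; i++) {
            StringBuilder line = new StringBuilder(String.format(Locale.ROOT, "%-7s", FRUITS[i]));
            for (double value : sim[i]) {
                line.append(String.format(Locale.ROOT, "%6.2f", value));
            }
            System.out.println(line);
        }
    }
}
```

The explicit triple loop is chosen for readability; for large rating matrices a linear algebra library would likely be a better fit.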
Problem applying UDF cosine similarity to grouped ML vectors in Pyspark



By : user2437352
Date : March 29 2020, 07:55 AM
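That's because Spark SQL doesn't support NumPy types. Convert the values to plain Python floats before returning them:
code :
from itertools import combinations

from numpy import linalg as LA
from pyspark.sql import functions as F
from pyspark.sql.types import ArrayType, DoubleType

@F.udf(ArrayType(DoubleType()))
def dot_group(M):
    # All unordered pairs of vectors in the group.
    combs = combinations(M, 2)
    return [
        # or float(i.dot(j) / (LA.norm(i) * LA.norm(j)))
        (i.dot(j) / (LA.norm(i) * LA.norm(j))).tolist()
        for i, j in combs
    ]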
Search the similarity of 2 strings in java using part of word matching, not cosine similarity



By : user3480692
Date : March 29 2020, 07:55 AM
For each search string, split it into words using haystack.split("\\s+") (\\s+ is a regular expression meaning 'the words are separated by whitespace').
Then, to obtain a 'score', you need two numbers: how many words matched, and how many words there are in total. Sort descending on the first and ascending on the second, which gets you the behaviour you seem to want.
code :
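String[] needle = "super cold white snow".split("\\s+");
String[] haystack = "white image superdupercold".split("\\s+");
int matchedWords = 0, totalWords = haystack.length;
for (String n : needle) {
    boolean found = false;
    for (String hay : haystack) {
        // A needle word counts as matched if it occurs inside any haystack word.
        if (hay.contains(n)) {
            found = true;
            break;
        }
    }
    if (found) matchedWords++;
}

// To sort on a single number, pack both values into one score (higher is better):
// more matched words dominate, and fewer total words break ties.
private static final long MULTIPLIER = 0x100000000L; // declare at class level
long score = MULTIPLIER * matchedWords + (Integer.MAX_VALUE - totalWords);

// Alternatively, keep both numbers in a small class and sort with a comparator.
@Value // Lombok: generates the constructor and getters
class Result { String needle; int words, total; }

list.sort(
    Comparator.comparing(Result::getWords).reversed()
        .thenComparing(Result::getTotal));

list.stream().map(Result::getNeedle).forEach(System.out::println);
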
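
The fragments above can be put together into one runnable sketch along these lines. This is an illustration under assumptions: the class PartialWordRanker, its Result class with a `text` field, and the score and rank methods are all invented for this example, and a plain nested class stands in for the Lombok @Value class so that no extra dependency is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class PartialWordRanker {

    /** Hypothetical holder for one scored candidate string. */
    static final class Result {
        final String text;
        final int matched; // how many search words were found
        final int total;   // how many words the candidate has

        Result(String text, int matched, int total) {
            this.text = text;
            this.matched = matched;
            this.total = total;
        }

        String getText() { return text; }
        int getMatched() { return matched; }
        int getTotal() { return total; }
    }

    /** Counts the search words that occur inside any word of the candidate. */
    static Result score(String search, String candidate) {
        String[] searchWords = search.trim().split("\\s+");
        String[] candidateWords = candidate.trim().split("\\s+");
        int matched = 0;
        for (String s : searchWords) {
            for (String c : candidateWords) {
                if (c.contains(s)) {
                    matched++;
                    break; // count each search word at most once
                }
            }
        }
        return new Result(candidate, matched, candidateWords.length);
    }

    /** Most matched words first; ties go to the candidate with fewer words. */
    static List<Result> rank(String search, List<String> candidates) {
        List<Result> results = new ArrayList<>();
        for (String candidate : candidates) {
            results.add(score(search, candidate));
        }
        results.sort(Comparator.comparingInt(Result::getMatched).reversed()
                .thenComparingInt(Result::getTotal));
        return results;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList(
                "snow", "white cold night sky here", "white image superdupercold");
        for (Result r : rank("super cold white snow", candidates)) {
            System.out.println(r.getText() + " (" + r.getMatched() + "/" + r.getTotal() + ")");
        }
        // Prints the candidates in this order:
        // white image superdupercold (3/3)
        // white cold night sky here (2/5)
        // snow (1/1)
    }
}
```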