[BUG] Mapping parser exception when using array fields (multi-valued) to vectorize #245

Closed
juntezhang opened this issue Aug 8, 2023 · 4 comments
Labels: bug (Something isn't working)

@juntezhang

What is the bug?

The neural search plugin returns a mapper_parsing_exception when the text_embedding processor vectorizes a multi-valued field, i.e. a field whose value is an array.

Example error response from the Bulk API:

{
    "took": 2,
    "ingest_took": 126,
    "errors": true,
    "items": [
        {
            "index": {
                "_index": "cbd49d4f7d2f4d378b4cf57b45836e65_standard.v1",
                "_id": "4",
                "status": 400,
                "error": {
                    "type": "mapper_parsing_exception",
                    "reason": "failed to parse field [_fulltext_vectorized] of type [knn_vector] in document with id '4'. Preview of field's value: '{knn=[0.0022234858, -0.035359997, -0.04597883, -0.04428486, -0.051025208, 0.07164392, 0.01914906, 0.019771628, 0.040593114, -0.10260139, -0.03696316, -0.026142513, 0.017337762, -0.07099895, 0.06488337, 0.011845544, 0.02151472, -0.0187775, 0.046898145, -0.0057110838, 0.029207353, 0.06773821, 0.022459025, 0.05387708, 0.017910926, 0.017033821, 0.009532835, -0.014977801, -0.08386825, -0.020387044, 0.0028280425, -0.0011476644, 0.009472969, 0.05817717, -0.029048974, 0.0016374304, 0.004514778, -0.026373068, 0.013422251, 0.027697867, -0.045738325, 0.009399452, 0.041691214, 0.020636152, -0.0015610099, -0.035841465, -0.014133858, -0.049274124, 0.019253967, -0.025773656, -0.022655146, -0.0443389, 0.016633136, -0.022475125, 0.024537688, -8.700431E-4, -0.0014111339, -0.044544637, 0.010789341, -0.007992201, -0.0040981085, 0.06470115, -0.05625269, 0.07036083, 0.013696916, 0.016641788, -0.039222337, -0.01709397, 0.022268193, -0.036803477, -0.005889197, -0.0017428886, 0.012373777, -0.0053222124, 5.630651E-4, -0.03114328, 0.022903379, 0.010015105, -0.09750444, -0.039831508, 0.026119227, 0.04916415, -0.03518284, -0.049163043, -0.058863427, 0.019288998, -0.004651093, -0.0068346174, -0.01554544, -0.0318574, 0.005209178, 0.022621633, 0.051216673, 0.060406737, -0.019851688, -0.020140702, -0.050286926, 0.04648004, 0.028480874, -0.0068861614, -0.038972907, 0.027410269, 0.039639413, 0.008856761, 0.04474063, 0.019777315, 0.04533569, -0.04286205, -0.051645227, 0.021571778, 0.026019301, 0.03279914, 0.007847676, 0.018339507, -0.025968133, -0.011049657, 0.012097716, -0.026544329, -0.017996104, 0.06355469, -0.044277992, -0.021543628, -0.05254539, 0.01735264, 0.09401051, -0.0014846673, 0.05555623, -0.056472104, 0.011484598, -0.006637214, 0.0029666661, -0.01887005, 0.015547031, 0.038509913, 0.05672793, -0.011875595, 0.031623192, -0.015370892, 
-0.005386455, -2.690006E-4, 0.015663855, -0.026425289, 0.014581022, -0.008019377, 0.009754192, -0.003575142, 4.3682655E-4, -0.0017491437, 0.07600901, -0.0075051626, 0.07084152, 0.032232024, 0.02919557, 0.00843389, 0.057466295, -0.0019165685, -0.047833677, 0.061958246, 0.0022890286, 0.04325089, -0.003123286, -0.064216696, -0.029908644, 0.026410121, 0.028926784, -0.009601688, -0.014144011, 0.0045824777, 0.022797842, 0.012890149, 0.001131833, 0.045201935, -0.008825862, -0.07638976, 0.027312782, 0.019408751, -0.03865251, 0.034349024, -0.0026372483, 0.030560758, -0.044272523, -0.012122316, 0.09661386, -0.02256945, 0.0062002423, 0.08038384, 0.00446424, 0.04955032, -0.0055512907, -0.0026584843, 0.033093628, -0.00980246, -0.0043738247, 0.0063798116, -0.021291338, 0.08625754, -0.03646296, -0.0487054, -0.011571014, -0.028801179, -0.025787387, 0.010490938, 0.003622506, -0.008190592, 0.0470385, -0.008248904, 0.0077942447, 0.020922707, -0.02686499, 0.10285625, 0.07073387, 0.010794978, -0.07248987, -0.095100045, 0.037754994, -0.021165807, -0.03498472, 0.05648951, 0.03388164, -0.029221494, -0.022435514, -9.771192E-4, -0.007607546, 0.03615137, -0.011542751, 0.006586234, -5.4524833E-4, 0.046416942, -0.017986912, -0.04261404, 0.024626905, -0.047040798, -0.033530287, -0.07259043, 0.035305493, -0.057256985, 0.0099811265, 0.026785806, -0.0038653421, -0.0112647405, -0.04575772, -0.008731651, -0.043923464, 0.0221416, 0.02560391, 0.0151665285, 0.050818447, 0.050663058, 0.04115714, 0.0023678623, -0.035306197, -0.084681004, 0.023927884, -0.029665923, 0.01732664, 0.023585321, -0.09153603, 0.011902792, 0.014034639, -0.016245406, -0.028288625, -0.033873256, 0.003353059, 0.053711176, 0.020320253, 0.047022935, 0.017852021, -0.004466688, 0.01584446, 0.013000825, 0.010487262, -0.01842222, 4.192322E-4, -0.023091855, -0.008465306, 0.006698411, 0.093931206, -0.0033818672, 0.010160068, 0.022345843, 0.060346037, 0.013339592, 0.022352727, -0.03324387, 0.04028923, -0.03625564, -0.014581585, -0.041865215, 
0.0026519487, -0.05310946, 0.048063193, 0.0455302, -0.00788708, -0.015701849, -0.018672984, -0.020087173, 0.019534169, 0.0039749774, 0.027360147, -0.031662032, 0.030710638, 0.03512735, 0.048522025, -0.062308006, -0.040190976, -0.032376904, -0.007325561, 0.0063507366, -0.011564964, 0.020408038, -0.0019888023, -0.03727972, -0.012309679, 0.047042694, 0.034211647, -0.051655166, 0.035434086, 0.061370537, 0.008199342, 0.015067597, 0.014224768, -0.012992773, 0.0077665015, 0.0095752645, -0.041191015, -0.03864752, -0.045672048, -0.03465211, -0.035151068, 0.04701482, -0.109876476, -0.046479177, 0.005260139, 0.006503089, -0.024829691, 0.0110920165, 0.019567052, -0.0012819336, -0.0050430726, -0.020127827, -0.026641257, -0.014725637, 0.11150941, 0.060796894, 0.015541844, 0.0066707935, 0.008368974, -0.052622072, -0.04412639, 0.030551784, -0.026100276, -0.005876258, 0.005038192, -0.026032347, 0.008800204, 0.026139341, 5.213012E-4, 0.0065577067, -0.05505976, 0.0330918, -0.008326094, -0.0396188, 0.0025030947, 0.026069254, 0.023491057, 0.02984612, -0.014990682, -0.06625647, -0.0046483255, 0.02964674, 0.0015031297, -0.015714135, -0.02850169, -0.0055536265, 0.0066902246, -0.007848156, -7.6600706E-4, 0.04313042, -0.015721817, 0.005434864, 0.024475155, -0.03546163, 0.017684974, 0.010431305, -3.305352E-5, -0.09190294, 0.008938634, 0.04076338, 0.0041706664, -0.016136287, -0.05161017, 0.053476337, 0.015459943, -0.023108725, 0.039780494, 0.024200078, -0.0838633, 0.06269914, -0.05062816, -0.06510745, -0.0025781589, 0.07451038, -0.02091673, 0.044712655, -0.04894433, 0.028831981, 0.034012303, -0.0023153357, -0.04997018, -0.075276874, 0.030397471, 0.014266653, 0.02622722, 0.0047556623, -0.038664035, 0.0015079024, 0.0032861913, -0.03532062, 0.046805825, -0.019798208, -0.006433615, -0.018331818, 0.031093929, -0.049762033, 0.036538515, 0.0041088043, -0.028467398, 0.03458563, 0.012369647, 0.0052060084, -0.023429459, -0.0024426244, 0.001823067, 0.019299336, 0.017141998, 0.0031919046, -0.07006116, 
-0.041087616, -0.0058480343, 0.058516745, 0.032500196, 0.020187413, 0.006855557, -0.012927736, 0.0022162858, -0.06437811, -0.009692816, 0.0039992286, 0.05996473, -0.021586077, -0.040819686, -0.024930276, 0.042631, -0.10328431, 0.014596877, 0.05685491, -0.013609369, 0.016496915, -0.0033860805, -0.021132236, 0.037780456, -0.024468174, -0.009608023, -0.023516266, -0.03733314, -0.006833277, -0.016395053, 0.0070661553, -0.0020087238, -0.056234054, -0.0077887755, 0.01678384, -0.025155602, -0.023703402, 0.028658977, -0.012632824, -0.015815027, -0.11297324, 0.043369632, 0.02495586, 0.028703807, 0.03492561, -0.011649105, 0.031252336, -0.022072453, -0.04119532, 0.07378223, -0.02075963, 0.016274827, -0.013694347, -0.016307695, 0.001300256, -0.036652822, 0.034588035, 1.8075841E-33, -0.0013329218, -0.04732977, 0.00996674, -0.029841881, -0.038386185, -0.04705207, 0.009828712, -0.0120018525, 0.01957792, -0.002065031, -0.014865287, 0.0020658378, 0.0045970078, 0.018740244, -0.023252893, 0.022443427, -0.06343757, -0.025586, -0.0037992818, -0.03004964, 0.013667456, -0.018568411, -0.028990187, 0.009482019, -0.06301416, -0.013344209, 0.017539762, 0.003624506, 0.03288363, 0.01715692, -0.006730218, 0.03863204, -0.0082167955, -0.001636874, 0.046989333, 0.021004293, -0.006706133, -0.00930357, 0.045862682, -0.012914553, -0.037851233, -0.08964776, -0.03390127, 0.044933114, -0.031888396, -0.07620988, -0.0020800682, 0.012426403, -0.07331452, 0.0076321703, -0.040095836, -0.038698476, 0.0017401874, 0.024982955, 0.0014869848, 0.028290188, -0.07428447, 0.034169685, -0.079832755, -0.038579136, 0.008933051, -0.006873085, 0.07893267, -0.01009624, 0.022934059, 0.009754162, 0.014792834, 0.060311176, 0.03243314, -0.006624844, -0.034329288, -0.007510828, 0.050048847, -0.004304317, 0.02510955, 0.021710811, 0.0307444, -0.0021874288, 0.015229203, 0.047972646, -0.039895188, 0.012476255, 0.02703852, 0.09733716, 0.06299362, 0.010616695, -0.0046062027, 0.044824835, 0.0061520166, -0.006623393, 0.09986233, 
0.02122441, 0.055295713, 0.001532098, 0.012163582, 0.0018900193, 1.3753287E-4, 0.015736612, -0.0063640936, 0.02673144, 0.035468813, -0.0111199105, 0.037446886, 0.02064939, -0.0016271996, -0.0027657875, -0.0046730214, -0.034245532, -0.0066429153, -0.018573927, -0.015771288, -0.0057752146, 0.028279593, 0.052879356, 0.070884794, -0.012343979, -0.022187816, -0.027230598, -0.057736237, 0.017123338, 0.05645999, -0.029136037, 0.018418355, 0.03351841, 0.034862872, 0.065103404, -0.019825313, -0.018390888, -0.021154428, -0.027518576, 0.043791376, -0.0072945477, 0.055040803, 0.016504884, 0.035224862, -0.032629233, -0.014595143, -0.05553731, -0.0298001, 0.012567941, -0.017359268, 0.07743764, -0.019259855, 0.028261904, -0.0067304126, -0.00778184, 0.035145696, -0.03499496, 0.043350633, 0.032245047, 0.057637613, -0.008384977, 0.025610931, -0.038622297, -0.076179355, -0.011728838, -0.052147277, -0.027141817, 0.0031384584, 0.057403617, -0.03149885, -0.07207539, -0.031181404, 0.034629732, 0.0046421094, 0.05123516, 0.048581116, 0.0026500921, 0.0069391984, -0.03898162, -0.057580877, -0.01872841, -0.023232281, 0.008946941, 0.019862209, 0.03084452, -0.02284944, -0.04255101, 0.018957136, -0.007989339, -0.015704466, 0.02101821, -0.010253275, -0.09247238, 0.006889824, -0.050714917, -0.013169677, 0.010791484, -0.01209312, 0.052326225, -0.015788112, -0.03533834, 0.018179337, -0.02753798, 0.03551319, -0.008871127, -0.0095051015, 0.0124840215, 0.046523266, 0.04466758, 0.03454243, 0.03360791, 0.010210552, -0.015331205, -0.0956788, -0.0251008, -0.014605966, -0.060167354, 0.004341411, -0.00951521, 0.042475585, 0.011100874, -0.03497262, -0.043598466, -0.00374743, -0.050682824, 0.07120098, 0.028575115, 0.014855652, 0.055505354, -0.020940848, -0.02267042, -0.007144853, 0.014108742, 0.023034291, -0.013486119, -0.0065789875, -0.04207302, -0.05829398, 0.030400414, -0.04194086, -0.041272007, 0.008843035, 0.057984617, 0.0541426, 0.007082873, 0.050124623, 0.05790826, 0.07747711, 0.030769683, -0.036104802, 
-0.030300677, -0.041837405, 0.024292568, 0.054189228, -0.015752152, 0.0042987284, 0.012297104, -0.059740506, 0.024804313, -0.01837132, 0.027748019, -0.021990905, 0.020890879, -7.413028E-4, -0.02815649, -0.017462464, 0.022596458, -0.0059415614, 0.04076583, 0.04511265, 0.08606662, 0.045326464, -0.020486556, 0.027558114, -0.063122466, -0.04743531, -0.022816103, -0.013700926, 0.015591036, -0.065042675, 0.005773686, -0.029538473]}'",
                    "caused_by": {
                        "type": "json_parse_exception",
                        "reason": "Current token (START_OBJECT) not numeric, can not use numeric value accessors\n at [Source: (byte[])\"{\"_fulltext_source\":[\"cowboy\"],\"_fulltext_vectorized\":[{\"knn\":[0.0022234858,-0.035359997,-0.04597883,-0.04428486,-0.051025208,0.07164392,0.01914906,0.019771628,0.040593114,-0.10260139,-0.03696316,-0.026142513,0.017337762,-0.07099895,0.06488337,0.011845544,0.02151472,-0.0187775,0.046898145,-0.0057110838,0.029207353,0.06773821,0.022459025,0.05387708,0.017910926,0.017033821,0.009532835,-0.014977801,-0.08386825,-0.020387044,0.0028280425,-0.0011476644,0.009472969,0.\"[truncated 9087 bytes]; line: 1, column: 92]"
                    }
                }
            }
        }
    ]
}
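
The caused_by excerpt in the response above suggests why the parser complains: for an array input, the processor output is an array of objects that each carry a vector under a `knn` key, whereas a plain knn_vector field expects a flat array of numbers. The following is a minimal sketch of that shape difference only; `describe_shape` is a hypothetical helper written for this illustration, and the vectors are cut down to three values taken from the error message.

```python
from typing import Any


def describe_shape(value: Any) -> str:
    """Return a short description of how a JSON-like value is nested."""
    if isinstance(value, list):
        if not value:
            return "list(empty)"
        # Describe the list by its length and the shape of its first element.
        return f"list[{len(value)}] of {describe_shape(value[0])}"
    if isinstance(value, dict):
        inner = ", ".join(f"{k}: {describe_shape(v)}" for k, v in value.items())
        return "object{" + inner + "}"
    return "number" if isinstance(value, (int, float)) else type(value).__name__


# What a plain knn_vector mapping expects: a flat array of numbers.
expected_by_mapping = [0.0022234858, -0.035359997, -0.04597883]

# What the error message shows for an array input (vector shortened here).
produced_for_array = [{"knn": [0.0022234858, -0.035359997, -0.04597883]}]

print(describe_shape(expected_by_mapping))
print(describe_shape(produced_for_array))
```

The second shape starts with an object where the first has a number, which matches the "Current token (START_OBJECT) not numeric" message.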

How can one reproduce the bug?

Add the vectorized field to the index mapping as a knn_vector field, as explained in the Sease tutorial.

Create an ingest pipeline with the following request body:

{
    "neural_pipeline": {
        "description": "Neural search pipeline",
        "processors": [
            {
                "text_embedding": {
                    "model_id": "<MODEL_ID>",
                    "field_map": {
                        "_fulltext_source": "_fulltext_vectorized"
                    }
                }
            }
        ]
    }
}

Then index the following document through the Bulk API:

    {"index":{"_id":4}}
    {"_fulltext_source":["cowboy"]}

The document has a single field whose value is an array containing one element.
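
For anyone scripting the reproduction, the bulk body above can be generated rather than typed by hand. This is only a sketch: `build_bulk_body` is a hypothetical helper, not part of any client library, and it only builds the newline-delimited text. Sending it to the cluster (with the pipeline applied) is left out because the endpoint and credentials depend on your setup.

```python
import json
from typing import Any, Dict, Iterable, Tuple


def build_bulk_body(docs: Iterable[Tuple[Any, Dict[str, Any]]]) -> str:
    """Build a newline-delimited Bulk API body from (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        # Action line, then the document source on the following line.
        lines.append(json.dumps({"index": {"_id": doc_id}}, separators=(",", ":")))
        lines.append(json.dumps(source, separators=(",", ":")))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body([(4, {"_fulltext_source": ["cowboy"]})])
print(body, end="")
```

Adding a second value to the array, for example `["cowboy", "horse"]`, is a quick way to check the multi-valued case beyond a single element.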

What is the expected behavior?

Every value in the array should be vectorized, and the resulting vectors should be usable in ANN search.

What is your host/environment?

macOS 13.4.1, running in Docker on the latest Ubuntu image.

Do you have any screenshots?

N/A

Do you have any additional context?

I am happy to contribute to a solution.

@juntezhang added the bug and untriaged labels on Aug 8, 2023
@navneet1v
Collaborator

@zane-neo can you please look into this issue?

@zane-neo
Collaborator

@juntezhang I can reproduce this issue with index settings like the following:

{
    "settings": {
        "index": {
            "knn": true,
            "knn.algo_param.ef_search": 100,
            "refresh_interval": -1,
            "default_pipeline": "my-pipeline"
        },
        "number_of_shards": 1,
        "number_of_replicas": 0
    },
    "mappings": {
        "properties": {
            "text_knn": {
                "type": "knn_vector",
                "dimension": 384,
                "method": {
                    "name": "hnsw",
                    "space_type": "l2",
                    "engine": "nmslib",
                    "parameters": {
                        "ef_construction": 128,
                        "m": 24
                    }
                }
            }
        }
    }
}

With these index settings you do get the mapping exception, because the knn vectors generated for a list input are wrapped in a nested list (an array of objects, each holding its vector under a knn key). Please try creating your index like this:

{
    "settings": {
        "index": {
            "knn": true,
            "knn.algo_param.ef_search": 100,
            "refresh_interval": -1,
            "default_pipeline": "my-pipeline"
        },
        "number_of_shards": 1,
        "number_of_replicas": 0
    },
    "mappings": {
        "properties": {
            "text_knn": {
                "type": "nested",
                "properties": {
                    "knn": {
                        "type": "knn_vector",
                        "dimension": 384,
                        "method": {
                            "name": "hnsw",
                            "space_type": "l2",
                            "engine": "nmslib",
                            "parameters": {
                                "ef_construction": 128,
                                "m": 24
                            }
                        }
                    }
                }
            }
        }
    }
}

Then try again. Thanks.
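
The nested mapping above uses the field name text_knn, while the original report uses _fulltext_vectorized. As a rough sketch of adapting it, the hypothetical helper below builds the same mappings section for any target field name. The dimension of 384 and the hnsw parameters are copied from the example above and are assumptions here; the dimension has to match whatever model the pipeline actually uses. The settings section is omitted.

```python
import json
from typing import Any, Dict


def nested_knn_mapping(field: str, dimension: int, inner: str = "knn") -> Dict[str, Any]:
    """Build a mappings body where `field` is nested and holds a knn_vector under `inner`."""
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "nested",
                    "properties": {
                        inner: {
                            "type": "knn_vector",
                            "dimension": dimension,
                            # Method settings copied from the example above.
                            "method": {
                                "name": "hnsw",
                                "space_type": "l2",
                                "engine": "nmslib",
                                "parameters": {"ef_construction": 128, "m": 24},
                            },
                        }
                    },
                }
            }
        }
    }


# Field name taken from the original report; 384 is an assumed dimension.
mapping = nested_knn_mapping("_fulltext_vectorized", 384)
print(json.dumps(mapping, indent=4))
```

The inner key defaults to knn because that is the key shown in the error output for array inputs.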

@zane-neo
Collaborator

@juntezhang Closing this issue. If you are still facing errors, please let us know.

@juntezhang
Author

I can confirm that this solution works! Sorry for the late response.
