Questions tagged [multimodal]

32 questions
4
votes
0 answers

can't change embedding dimension to pass it through gpt2

I'm practicing image captioning and have some problems with tensors of different dimensions. I have an image embedding of size [1, 512], but GPT2, which I use for caption generation, needs size [n, 768], where n is the number of tokens in the caption's…
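A common fix for this kind of shape mismatch (a sketch, not taken from the question) is to learn a linear projection from the 512-dim image embedding into GPT-2's 768-dim embedding space and prepend the result as a "visual prefix" token. A minimal numpy sketch, with a random matrix standing in for a trained `nn.Linear(512, 768)` and random vectors standing in for real token embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)

image_emb = rng.standard_normal((1, 512))   # CLIP-style image embedding, shape [1, 512]
W = rng.standard_normal((512, 768)) * 0.02  # stand-in for a learned projection, e.g. nn.Linear(512, 768)

prefix = image_emb @ W                      # shape [1, 768] -- one "visual token" for GPT-2
token_embs = rng.standard_normal((5, 768))  # stand-in for embeddings of n=5 caption tokens

# Concatenate along the sequence axis: GPT-2 then sees [n+1, 768]
gpt2_input = np.concatenate([prefix, token_embs], axis=0)
print(gpt2_input.shape)  # (6, 768)
```

In a real PyTorch setup the projection weights would be trained jointly with (or before) the captioning objective, and the concatenated sequence would be passed to GPT-2 via `inputs_embeds`.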
3
votes
0 answers

How to pass one data array per model input in multimodal deep autoencoder?

I'm working on a deep multimodal autoencoder for dimensionality reduction, and I'm following this code (https://wizardforcel.gitbooks.io/deep-learning-keras-tensorflow/8.2%20Multi-Modal%20Networks.html): from keras.layers import Dense, Input from…
Andrea
  • 113
  • 1
  • 7
2
votes
1 answer

What method and tool for regression analysis for a multimodal distribution in R?

I have a set of variables X1 and X2 and Y, with the relationship plotted below. X2 values are used for color coding. X1, X2, and X3 are integer variables. The observed pattern is multimodal. What is the best way to predict Y based on X1 and…
vp_050
  • 583
  • 2
  • 4
  • 16
1
vote
0 answers

How to combine multiple images with one signal data in a dataset (Python/PyTorch/MultiModal)

I want to build a multimodal model; for every signal sequence I have several pictures. For example, I have 10 images that correspond to 5 s of force data, which I want to combine into one batch. That means I want to build a model where those 10…
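One way to pair several images with one signal window (a sketch with assumed shapes, not taken from the question) is to stack the frames along a new leading axis and return them together with the signal as a single sample; a PyTorch `Dataset.__getitem__` would return exactly this pair as tensors. A numpy sketch:

```python
import numpy as np

# Hypothetical shapes: 10 grayscale frames of 64x64 pixels per 5-second force window
n_frames, height, width = 10, 64, 64
signal_len = 500  # e.g. 5 s of force data sampled at 100 Hz

rng = np.random.default_rng(0)
frames = [rng.standard_normal((height, width)) for _ in range(n_frames)]
signal = rng.standard_normal(signal_len)

# One sample = all frames stacked along a new leading axis, paired with its signal.
images = np.stack(frames, axis=0)      # shape (10, 64, 64)
sample = {"images": images, "signal": signal}
print(sample["images"].shape, sample["signal"].shape)
```

With this layout, a DataLoader batch has shape `(batch, 10, H, W)` for the images and `(batch, signal_len)` for the signals, so the model can encode the two modalities separately and fuse them.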
SunIsGod
  • 11
  • 2
1
vote
1 answer

get contrastive_logits_per_image with flava model using huggingface library

I used the FLAVA model code from this link: https://huggingface.co/docs/transformers/model_doc/flava#transformers.FlavaModel.forward.example but I am getting the following error: 'FlavaModelOutput' object has no attribute…
1
vote
1 answer

prediction logits using lxmert with hugging face library

How can we get the prediction logits from the lxmert model using the Hugging Face library? It's fairly easy to get them with visualbert, but I'm not able to with the lxmert model. In the case of the visualbert model, the keys I'm getting are…
1
vote
0 answers

Are there any alternatives to COVAREP in python?

I find that many multimodal sentiment analysis datasets (like CMU-MOSI) use COVAREP to extract the audio features (74 dimensions). But I'm not familiar with Matlab, so I wonder if there is some way for me to get the same features as COVAREP…
junyi chen
  • 11
  • 1
1
vote
0 answers

how can we apply masked language modelling on the images using multimodal models? How can we implement such a thing and get MLM scores?

How can we apply masked language modelling when both text and an image are given, using multimodal models like lxmert? For example, if there is some text given (This is a MASK) and we mask some…
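However the logits are obtained, an MLM "score" for a candidate token is just the log-softmax of the model's per-position vocabulary logits at the masked index. A minimal numpy sketch (the toy logits here are random; with lxmert they would come from the model's prediction head):

```python
import numpy as np

def mlm_score(logits, masked_pos, token_id):
    """Log-probability of `token_id` at `masked_pos`, given (seq_len, vocab) logits."""
    row = logits[masked_pos]  # shape (vocab_size,)
    # numerically stable log-softmax: row - logsumexp(row)
    log_probs = row - (np.log(np.sum(np.exp(row - row.max()))) + row.max())
    return log_probs[token_id]

# Toy example: 4 positions, vocabulary of 6; position 3 is the masked token.
rng = np.random.default_rng(0)
logits = rng.standard_normal((4, 6))
score = mlm_score(logits, masked_pos=3, token_id=2)
print(float(score))  # a log-probability, so always <= 0
```

Ranking candidate tokens by this score at the masked position reproduces the usual "fill-mask" behaviour.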
1
vote
0 answers

Layer "model" expects 2 input(s), but it received 1 input tensors

I built a VQA model with two inputs (images, questions). It trained fine with the train/val datasets, but with test_dataset it keeps printing errors like the one below: ValueError: Layer "model" expects 2 input(s), but it received 1 input tensors. Inputs…
1
vote
0 answers

Detect multimodal distribution and split the data in R

I have data with more than 10000 distributions that look like the ones in red. I want to compare each of them with a reference distribution like the one in blue. Because some are unimodal and some are multimodal, I cannot use a t-test for all of…
RCchelsie
  • 111
  • 6
1
vote
0 answers

Test differences in multimodal distributions for different groups in R or Python

I am analyzing data from 3 different gait speeds. For each group/speed, I determine a specific value called "angle". Each group has a different sample size. So I need to compare multimodal distributions, and I would like to statistically test…
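A distribution-free option for this kind of comparison (a Python sketch, since the question allows R or Python) is the two-sample Kolmogorov-Smirnov statistic, which compares whole empirical CDFs and so makes no unimodality assumption; for circular "angle" data a circular variant such as Kuiper's test may be more appropriate. A self-contained numpy implementation of the statistic:

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max distance between empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])           # the sup is attained at a sample point
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return np.abs(cdf_a - cdf_b).max()

rng = np.random.default_rng(0)
# Two synthetic bimodal "angle" samples with different sample sizes
g1 = np.concatenate([rng.normal(10, 2, 150), rng.normal(40, 3, 150)])
g2 = np.concatenate([rng.normal(12, 2, 100), rng.normal(40, 3, 100)])
print(ks_statistic(g1, g2))
```

For p-values, `scipy.stats.ks_2samp` implements the same statistic with its sampling distribution; unequal group sizes are handled naturally.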
1
vote
2 answers

How to use the modal in the list in react native (a specific Modal for each list item)?

I made a customized list component (in React Native) which shows touchable images with some description text. I need each image to open a specific Modal, but I don't know how or where I should code the Modal. ... here is my photo list…
1
vote
1 answer

Plot unimodal distributions determined from a multimodal distribution

I've used GaussianMixture to analyze a multimodal distribution. From the GaussianMixture class I can access the means and covariances using the attributes means_ and covariances_. How can I use them to now plot the two underlying unimodal…
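Once `means_` and `covariances_` are available, each underlying unimodal curve is just the component weight times a Gaussian pdf, which can be computed directly with numpy. A sketch assuming 1-D data and `covariance_type='full'` (where sklearn returns `means_` of shape `(k, 1)` and `covariances_` of shape `(k, 1, 1)`); the numeric values below are illustrative, not fitted:

```python
import numpy as np

def component_pdfs(x, means, covariances, weights):
    """Per-component weighted Gaussian pdfs for a 1-D GaussianMixture,
    using means_, covariances_, weights_ as returned by sklearn."""
    pdfs = []
    for mu, var, w in zip(np.ravel(means), np.ravel(covariances), weights):
        pdfs.append(w * np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var))
    return pdfs

x = np.linspace(-10, 10, 2001)
pdfs = component_pdfs(x,
                      means=[[-2.0], [3.0]],          # gm.means_
                      covariances=[[[1.0]], [[4.0]]], # gm.covariances_
                      weights=[0.4, 0.6])             # gm.weights_
# Each unimodal curve can then be drawn with matplotlib: plt.plot(x, pdfs[0]), etc.
```

Each weighted curve integrates to its component weight, so plotting them alongside a histogram (with `density=True`) shows how the mixture decomposes the multimodal distribution.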
riyansh.legend
  • 117
  • 1
  • 13
1
vote
0 answers

How to implement three-way clustering in python

I am relatively new to the field of data science. Recently I came across these concepts and I am really keen to implement them, i.e. the concept of multimodal clustering applications. (From here I got the idea -…
K C
  • 413
  • 4
  • 15
1
vote
0 answers

Can pre-trained ResNet50 be used for very low resolution image?

I need to find the best-matching image given a text description. However, the resolution is very low, i.e. 50 x 50 pixels. In this case, can pre-trained ResNet50 be used? Or any recommendations for a better architecture? Thanks!
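A pre-trained ResNet50 generally expects 224x224 inputs, so the usual workaround is to upsample the 50x50 images first (in practice via torchvision or PIL resizing; bicubic interpolation is common). A dependency-free nearest-neighbour sketch of the index mapping involved:

```python
import numpy as np

def nearest_resize(img, out_h, out_w):
    """Nearest-neighbour upsample of an (H, W, C) image -- a stand-in for
    torchvision/PIL resizing before feeding ResNet50's expected 224x224 input."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return img[rows][:, cols]

rng = np.random.default_rng(0)
low_res = rng.random((50, 50, 3))          # a 50x50 image like those in the question
resized = nearest_resize(low_res, 224, 224)
print(resized.shape)  # (224, 224, 3)
```

Upsampling adds no detail, so fine-tuning on the low-resolution data (or a model designed for small inputs) usually works better than relying on the frozen pre-trained features alone.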
HappyCoding
  • 5,029
  • 7
  • 31
  • 51