3,795 questions
Tooling
1
vote
3
replies
62
views
Email Parser with signature extraction in Python
I am trying to create an email parser that can split an email thread in .msg format into individual messages, extract signatures from those messages, and extract email addresses from the From: section.
I am able to split,...
0
votes
0
answers
96
views
Binary incompatibility with Numpy [duplicate]
I am running a Kaggle notebook with a T4 GPU with the following code:
%%capture
!pip uninstall -y transformers datasets accelerate peft huggingface_hub bitsandbytes sentence-transformers faiss-gpu
!...
0
votes
0
answers
32
views
Using gold POS tags in spaCy to influence dependency parsing at inference time
I’m working with spaCy v3.x on Italian product titles and I have trained a custom POS tagger and dependency parser on a domain-specific dataset.
In my use case, I sometimes need to override or correct ...
Advice
0
votes
1
replies
65
views
Organisation/Person tagging using spaCy
We’re working on a problem where our master dataset contains names of organizations and individuals, but some entries are untagged. We only have the names (no additional details such as email or ...
1
vote
1
answer
69
views
Output of for loop filling down in dataframe instead of returning corresponding values for each row
I'm using spaCy to process a series of sentences and return the five most common words in each sentence. My goal is to store the output of that frequency analysis (using Counter) in a column beside ...
0
votes
0
answers
56
views
Training data format for SpanCategorizer when using custom suggester function
I'm taking a stab at building my own claim extraction pipeline (first time spaCy user).
Upstream in my pipeline, I feed n docs to the NER component of the pretrained en_core_web_sm model in order to ...
0
votes
0
answers
66
views
Training with spaCy from command line, don't know why gpu-id not recognized
I am having the hardest time getting my training session to use GPU 0, which by every measure is present and correctly set up with CUDA 12.2.
When I try to do python -m spacy train base_config....
0
votes
1
answer
352
views
How to make Microsoft Presidio detect and mask Indian names and unusual text patterns in banking data?
I’m working on anonymizing PII in banking text using Microsoft Presidio.
The built-in PERSON recognizer (which uses spaCy under the hood) works for some Western names and when the sentence is clear ...
1
vote
1
answer
95
views
How can I extract symptoms/diseases from a running transcription?
I'm working on a project where I'm attempting to extract medical symptoms from a running transcription. I'm using SocketIO to get mic audio and then using Whisper to transcribe the audio into text ...
1
vote
2
answers
129
views
How to efficiently use spaCy for POS tagging and NER
I have 200 documents and want to run NER and POS tagging on them. However, I find spaCy too slow (I am running this code in Google Colab):
for doc in nlp.pipe(dataset["text"], batch_size=...
0
votes
0
answers
66
views
spaCy spancat won’t learn (zero F-score) while NER on same data scores 0.40 — Prodigy-generated KPI/target corpus
I am trying to train a spaCy v3.8.7 spancat model on ~100 sustainability reports (annotated with Prodigy) to extract KPIs and targets.
An NER pipeline trained on the same data reaches F≈0.40, but ...
0
votes
1
answer
234
views
Unable to install spacy on MacOS 15.5 (M2) with Python 3.13.3 [duplicate]
Having created a new venv, I am attempting to install spaCy strictly in accordance with the documentation.
Specifically:
pip install -U pip setuptools wheel
pip install -U 'spacy[apple]'
This fails (...
-2
votes
2
answers
861
views
pip install spacy errors with Python 3.13
I'm new to Python, and my professor gave me code that includes "import spacy". When I run the code, I get the line: ModuleNotFoundError: No module named 'spacy'
That's where I ...
0
votes
0
answers
28
views
spaCy DependencyMatcher: One head for multiple children
How can I extract a single noun that is the head of multiple children?
I'm facing an issue with dependency matching in spaCy. I want to extract the nouns describing the named entities (identified by ...
4
votes
0
answers
69
views
Can older spaCy models be ported to future spaCy versions?
The latest spaCy versions have better performance and compatibility for GPU acceleration on Apple devices, but I have an existing project that depends on spaCy 3.1.4 and some of the specific behavior ...